- 17 3月, 2020 7 次提交
-
-
由 Sean Christopherson 提交于
Drop the "const" attribute from @old in kvm_arch_commit_memory_region() to allow arch specific code to free arch specific resources in the old memslot without having to cast away the attribute. Freeing resources in kvm_arch_commit_memory_region() paves the way for simplifying kvm_free_memslot() by eliminating the last usage of its @dont param. Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Split out the core functionality of setting a memslot into a separate helper in preparation for moving memslot deletion into its own routine. Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Replace a big pile o' gotos with returns to make it more obvious what error code is being returned, and to prepare for refactoring the functional, i.e. post-checks, portion of __kvm_set_memory_region(). Reviewed-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Explicitly free an allocated-but-unused dirty bitmap instead of relying on kvm_free_memslot() if an error occurs in __kvm_set_memory_region(). There is no longer a need to abuse kvm_free_memslot() to free arch specific resources as arch specific code is now called only after the common flow is guaranteed to succeed. Arch code can still fail, but it's responsible for its own cleanup in that case. Eliminating the error path's abuse of kvm_free_memslot() paves the way for simplifying kvm_free_memslot(), i.e. dropping its @dont param. Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
The two implementations of kvm_arch_create_memslot() in x86 and PPC are both good citizens and free up all local resources if creation fails. Return immediately (via a superfluous goto) instead of calling kvm_free_memslot(). Note, the call to kvm_free_memslot() is effectively an expensive nop in this case as there are no resources to be freed. No functional change intended. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Reinstall the old memslots if preparing the new memory region fails after invalidating a to-be-{re}moved memslot. Remove the superfluous 'old_memslots' variable so that it's somewhat clear that the error handling path needs to free the unused memslots, not simply the 'old' memslots. Fixes: bc6678a3 ("KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update") Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 2月, 2020 1 次提交
-
-
由 Marc Zyngier 提交于
Accessing a per-cpu variable only makes sense when preemption is disabled (and the kernel does check this when the right debug options are switched on). For kvm_get_running_vcpu(), it is fine to return the value after re-enabling preemption, as the preempt notifiers will make sure that this is kept consistent across task migration (the comment above the function hints at it, but lacks the crucial preemption management). While we're at it, move the comment from the ARM code, which explains why the whole thing works. Fixes: 7495e22b ("KVM: Move running VCPU from ARM to common code"). Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: NZenghui Yu <yuzenghui@huawei.com> Tested-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/318984f6-bc36-33a3-abc6-bf2295974b06@huawei.com Message-id: <20200207163410.31276-1-maz@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 05 2月, 2020 1 次提交
-
-
由 Zhuang Yanying 提交于
We are testing Virtual Machine with KSM on v5.4-rc2 kernel, and found the zero_page refcount overflow. The cause of refcount overflow is increased in try_async_pf (get_user_page) without being decreased in mmu_set_spte() while handling ept violation. In kvm_release_pfn_clean(), only unreserved page will call put_page. However, zero page is reserved. So, as well as creating and destroy vm, the refcount of zero page will continue to increase until it overflows. step1: echo 10000 > /sys/kernel/pages_to_scan/pages_to_scan echo 1 > /sys/kernel/pages_to_scan/run echo 1 > /sys/kernel/pages_to_scan/use_zero_pages step2: just create several normal qemu kvm vms. And destroy it after 10s. Repeat this action all the time. After a long period of time, all domains hang because of the refcount of zero page overflow. Qemu print error log as follow: … error: kvm run failed Bad address EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4 EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] FS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] GS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy GDT= 000f7070 00000037 IDT= 000f70ae 00000000 CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04 … Meanwhile, a kernel warning is departed. [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30 [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G OE 5.2.0-rc2 #5 [40914.836415] RIP: 0010:try_get_page+0x1f/0x30 [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 0 0 00 00 00 48 8b 47 08 a8 [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286 [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000 [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440 [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000 [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8 [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440 [40914.836423] FS: 00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000 [40914.836425] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0 [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [40914.836428] Call Trace: [40914.836433] follow_page_pte+0x302/0x47b [40914.836437] __get_user_pages+0xf1/0x7d0 [40914.836441] ? irq_work_queue+0x9/0x70 [40914.836443] get_user_pages_unlocked+0x13f/0x1e0 [40914.836469] __gfn_to_pfn_memslot+0x10e/0x400 [kvm] [40914.836486] try_async_pf+0x87/0x240 [kvm] [40914.836503] tdp_page_fault+0x139/0x270 [kvm] [40914.836523] kvm_mmu_page_fault+0x76/0x5e0 [kvm] [40914.836588] vcpu_enter_guest+0xb45/0x1570 [kvm] [40914.836632] kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm] [40914.836645] kvm_vcpu_ioctl+0x26e/0x5d0 [kvm] [40914.836650] do_vfs_ioctl+0xa9/0x620 [40914.836653] ksys_ioctl+0x60/0x90 [40914.836654] __x64_sys_ioctl+0x16/0x20 [40914.836658] do_syscall_64+0x5b/0x180 [40914.836664] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [40914.836666] RIP: 0033:0x7fb61cb6bfc7 Signed-off-by: NLinFeng <linfeng23@huawei.com> Signed-off-by: NZhuang Yanying <ann.zhuangyanying@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 31 1月, 2020 2 次提交
-
-
由 Boris Ostrovsky 提交于
__kvm_map_gfn()'s call to gfn_to_pfn_memslot() is * relatively expensive * in certain cases (such as when done from atomic context) cannot be called Stashing gfn-to-pfn mapping should help with both cases. This is part of CVE-2019-3016. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: NJoao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Boris Ostrovsky 提交于
kvm_vcpu_(un)map operates on gfns from any current address space. In certain cases we want to make sure we are not mapping SMRAM and for that we can use kvm_(un)map_gfn() that we are introducing in this patch. This is part of CVE-2019-3016. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: NJoao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 1月, 2020 16 次提交
-
-
由 Sean Christopherson 提交于
Avoid the "writable" check in __gfn_to_hva_many(), which will always fail on read-only memslots due to gfn_to_hva() assuming writes. Functionally, this allows x86 to create large mappings for read-only memslots that are backed by HugeTLB mappings. Note, the changelog for commit 05da4558 ("KVM: MMU: large page support") states "If the largepage contains write-protected pages, a large pte is not used.", but "write-protected" refers to pages that are temporarily read-only, e.g. read-only memslots didn't even exist at the time. Fixes: 4d8b81ab ("KVM: introduce readonly memslot") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> [Redone using kvm_vcpu_gfn_to_memslot_prot. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the correct set of memslots is used when handling x86 page faults in SMM. Fixes: 54bf36aa ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a helper, is_transparent_hugepage(), to explicitly check whether a compound page is a THP and use it when populating KVM's secondary MMU. The explicit check fixes a bug where a remapped compound page, e.g. for an XDP Rx socket, is mapped into a KVM guest and is mistaken for a THP, which results in KVM incorrectly creating a huge page in its secondary MMU. Fixes: 936a5fe6 ("thp: kvm mmu transparent hugepage support") Reported-by: syzbot+c9d1fb51ac9d0d10c39d@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Check the result of __kvm_gfn_to_hva_cache_init() and return immediately instead of relying on the kvm_is_error_hva() check to detect errors so that it's abundantly clear KVM intends to immediately bail on an error. Note, the hva check is still mandatory to handle errors on subqeuesnt calls with the same generation. Similarly, always return -EFAULT on error so that multiple (bad) calls for a given generation will get the same result, e.g. on an illegal gfn wrap, propagating the return from __kvm_gfn_to_hva_cache_init() would cause the initial call to return -EINVAL and subsequent calls to return -EFAULT. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Barret reported a (technically benign) bug where nr_pages_avail can be accessed without being initialized if gfn_to_hva_many() fails. virt/kvm/kvm_main.c:2193:13: warning: 'nr_pages_avail' may be used uninitialized in this function [-Wmaybe-uninitialized] Rather than simply squashing the warning by initializing nr_pages_avail, fix the underlying issues by reworking __kvm_gfn_to_hva_cache_init() to return immediately instead of continuing on. Now that all callers check the result and/or bail immediately on a bad hva, there's no need to explicitly nullify the memslot on error. Reported-by: NBarret Rhoden <brho@google.com> Fixes: f1b9dd5e ("kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init") Cc: Jim Mattson <jmattson@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
When reading/writing using the guest/host cache, check for a bad hva before checking for a NULL memslot, which triggers the slow path for handing cross-page accesses. Because the memslot is nullified on error by __kvm_gfn_to_hva_cache_init(), if the bad hva is encountered after crossing into a new page, then the kvm_{read,write}_guest() slow path could potentially write/access the first chunk prior to detecting the bad hva. Arguably, performing a partial access is semantically correct from an architectural perspective, but that behavior is certainly not intended. In the original implementation, memslot was not explicitly nullified and therefore the partial access behavior varied based on whether the memslot itself was null, or if the hva was simply bad. The current behavior was introduced as a seemingly unintentional side effect in commit f1b9dd5e ("kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init"), which justified the change with "since some callers don't check the return code from this function, it sit seems prudent to clear ghc->memslot in the event of an error". Regardless of intent, the partial access is dependent on _not_ checking the result of the cache initialization, which is arguably a bug in its own right, at best simply weird. Fixes: 8f964525 ("KVM: Allow cross page reads and writes from cached translations.") Cc: Jim Mattson <jmattson@google.com> Cc: Andrew Honig <ahonig@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
For ring-based dirty log tracking, it will be more efficient to account writes during schedule-out or schedule-in to the currently running VCPU. We would like to do it even if the write doesn't use the current VCPU's address space, as is the case for cached writes (see commit 4e335d9e, "Revert "KVM: Support vCPU-based gfn->hva cache"", 2017-05-02). Therefore, add a mechanism to track the currently-loaded kvm_vcpu struct. There is already something similar in KVM/ARM; one important difference is that kvm_arch_vcpu_{load,put} have two callers in virt/kvm/kvm_main.c: we have to update both the architecture-independent vcpu_{load,put} and the preempt notifiers. Another change made in the process is to allow using kvm_get_running_vcpu() in preemptible code. This is allowed because preempt notifiers ensure that the value does not change even after the VCPU thread is migrated. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Peter Xu 提交于
It's already going to reach 2400 Bytes (which is over half of page size on 4K page archs), so maybe it's good to have this build-time check in case it overflows when adding new fields. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Peter Xu 提交于
Remove kvm_read_guest_atomic() because it's not used anywhere. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Open code the allocation and freeing of the vcpu->run page in kvm_vm_ioctl_create_vcpu() and kvm_vcpu_destroy() respectively. Doing so allows kvm_vcpu_init() to be a pure init function and eliminates kvm_vcpu_uninit() entirely. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the putting of vcpu->pid to kvm_vcpu_destroy(). vcpu->pid is guaranteed to be NULL when kvm_vcpu_uninit() is called in the error path of kvm_vm_ioctl_create_vcpu(), e.g. it is explicitly nullified by kvm_vcpu_init() and is only changed by KVM_RUN. No functional change intended. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch specific implementations are nops. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove kvm_arch_vcpu_setup() now that all arch specific implementations are nops. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Initialize the preempt notifier immediately in kvm_vcpu_init() to pave the way for removing kvm_arch_vcpu_setup(), i.e. to allow arch specific code to call vcpu_load() during kvm_arch_vcpu_create(). Back when preemption support was added, the location of the call to init the preempt notifier was perfectly sane. The overall vCPU creation flow featured a single arch specific hook and the preempt notifer was used immediately after its initialization (by vcpu_load()). E.g.: vcpu = kvm_arch_ops->vcpu_create(kvm, n); if (IS_ERR(vcpu)) return PTR_ERR(vcpu); preempt_notifier_init(&vcpu->preempt_notifier, &kvm_preempt_ops); vcpu_load(vcpu); r = kvm_mmu_setup(vcpu); vcpu_put(vcpu); if (r < 0) goto free_vcpu; Today, the call to preempt_notifier_init() is sandwiched between two arch specific calls, kvm_arch_vcpu_create() and kvm_arch_vcpu_setup(), which needlessly forces x86 (and possibly others?) to split its vCPU creation flow. Init the preempt notifier prior to any arch specific call so that each arch can independently decide how best to organize its creation flow. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Unexport kvm_vcpu_cache and kvm_vcpu_{un}init() and make them static now that they are referenced only in kvm_main.c. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that all architectures tightly couple vcpu allocation/free with the mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to common KVM code. Move both allocation and initialization in a single patch to eliminate thrash in arch specific code. The bisection benefits of moving the two pieces in separate patches is marginal at best, whereas the odds of introducing a transient arch specific bug are non-zero. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 1月, 2020 2 次提交
-
-
由 Sean Christopherson 提交于
Add kvm_vcpu_destroy() and wire up all architectures to call the common function instead of their arch specific implementation. The common destruction function will be used by future patches to move allocation and initialization of vCPUs to common KVM code, i.e. to free resources that are allocated by arch agnostic code. No functional change intended. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a pre-allocation arch hook to handle checks that are currently done by arch specific code prior to allocating the vCPU object. This paves the way for moving the allocation to common KVM code. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 1月, 2020 4 次提交
-
-
由 Milan Pandurov 提交于
We can store reference to kvm_stats_debugfs_item instead of copying its values to kvm_stat_data. This allows us to remove duplicated code and usage of temporary kvm_stat_data inside vm_stat_get et al. Signed-off-by: NMilan Pandurov <milanpa@amazon.de> Reviewed-by: NAlexander Graf <graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Miaohe Lin 提交于
Fix some writing mistakes in the comments. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Miaohe Lin 提交于
Fix some grammar mistakes in the comments. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Miaohe Lin 提交于
Fix some wrong function names in comment. mmu_check_roots is a typo for mmu_check_root, vmcs_read_any should be vmcs12_read_any and so on. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 1月, 2020 1 次提交
-
-
由 Miaohe Lin 提交于
We can get rid of unnecessary var page in kvm_set_pfn_dirty() , thus make code style similar with kvm_set_pfn_accessed(). Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 11月, 2019 1 次提交
-
-
由 Miaohe Lin 提交于
The jump label out_free_1 and out_free_2 deal with the same stuff, so git rid of one and rename the label out_free_0a to retain the label name order. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 11月, 2019 2 次提交
-
-
由 Radim Krčmář 提交于
Fetching an index for any vcpu in kvm->vcpus array by traversing the entire array everytime is costly. This patch remembers the position of each vcpu in kvm->vcpus array by storing it in vcpus_idx under kvm_vcpu structure. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NNitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Marc Zyngier 提交于
Add a comment explaining the rational behind having both no_compat open and ioctl callbacks to fend off compat tasks. Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 11月, 2019 1 次提交
-
-
由 Marc Zyngier 提交于
On a system without KVM_COMPAT, we prevent IOCTLs from being issued by a compat task. Although this prevents most silly things from happening, it can still confuse a 32bit userspace that is able to open the kvm device (the qemu test suite seems to be pretty mad with this behaviour). Take a more radical approach and return a -ENODEV to the compat task. Reported-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 11月, 2019 1 次提交
-
-
由 Sean Christopherson 提交于
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and instead manually handle ZONE_DEVICE on a case-by-case basis. For things like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal pages, e.g. put pages grabbed via gup(). But for flows such as setting A/D bits or shifting refcounts for transparent huge pages, KVM needs to to avoid processing ZONE_DEVICE pages as the flows in question lack the underlying machinery for proper handling of ZONE_DEVICE pages. This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup() when running a KVM guest backed with /dev/dax memory, as KVM straight up doesn't put any references to ZONE_DEVICE pages acquired by gup(). Note, Dan Williams proposed an alternative solution of doing put_page() on ZONE_DEVICE pages immediately after gup() in order to simplify the auditing needed to ensure is_zone_device_page() is called if and only if the backing device is pinned (via gup()). But that approach would break kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til unmap() when accessing guest memory, unlike KVM's secondary MMU, which coordinates with mmu_notifier invalidations to avoid creating stale page references, i.e. doesn't rely on pages being pinned. [*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.plReported-by: NAdam Borowski <kilobyte@angband.pl> Analyzed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NDan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org Fixes: 3565fce3 ("mm, x86: get_user_pages() for dax mappings") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 11月, 2019 1 次提交
-
-
由 Paolo Bonzini 提交于
Reported by syzkaller: ============================= WARNING: suspicious RCU usage ----------------------------- ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by repro_11/12688. stack backtrace: Call Trace: dump_stack+0x7d/0xc5 lockdep_rcu_suspicious+0x123/0x170 kvm_dev_ioctl+0x9a9/0x1260 [kvm] do_vfs_ioctl+0x1a1/0xfb0 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x73/0xb0 do_syscall_64+0x108/0xaa0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Commit a97b0e77 (kvm: call kvm_arch_destroy_vm if vm creation fails) sets users_count to 1 before kvm_arch_init_vm(), however, if kvm_arch_init_vm() fails, we need to decrease this count. By moving it earlier, we can push the decrease to out_err_no_arch_destroy_vm without introducing yet another error label. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=15209b84e00000 Reported-by: syzbot+75475908cd0910f141ee@syzkaller.appspotmail.com Fixes: a97b0e77 ("kvm: call kvm_arch_destroy_vm if vm creation fails") Cc: Jim Mattson <jmattson@google.com> Analyzed-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-