- 21 4月, 2020 3 次提交
-
-
由 Tianjia Zhang 提交于
In earlier versions of kvm, 'kvm_run' was an independent structure and was not included in the vcpu structure. At present, 'kvm_run' is already included in the vcpu structure, so the parameter 'kvm_run' is redundant. This patch simplifies the function definition, removes the extra 'kvm_run' parameter, and extracts it from the 'kvm_vcpu' structure if necessary. Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com> Message-Id: <20200416051057.26526-1-tianjia.zhang@linux.alibaba.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Create a new function kvm_is_visible_memslot() and use it from kvm_is_visible_gfn(); use the new function in try_async_pf() too, to avoid an extra memslot lookup. Opportunistically squish a multi-line comment into a single-line comment. Note, the end result, KVM_PFN_NOSLOT, is unchanged. Cc: Jim Mattson <jmattson@google.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Suggested-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The placement of kvm_create_vcpu_debugfs is more or less irrelevant, since it cannot fail and userspace should not care about the debugfs entries until it knows the vcpu has been created. Moving it after the last failure point removes the need to remove the directory when unwinding the creation. Reviewed-by: NEmanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20200331224222.393439-1-pbonzini@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 4月, 2020 1 次提交
-
-
由 Colin Ian King 提交于
The variable r is being assigned with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: NColin Ian King <colin.king@canonical.com> Message-Id: <20200410113526.13822-1-colin.king@canonical.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 31 3月, 2020 1 次提交
-
-
由 Sean Christopherson 提交于
Pass @opaque to kvm_arch_hardware_setup() and kvm_arch_check_processor_compat() to allow architecture specific code to reference @opaque without having to stash it away in a temporary global variable. This will enable x86 to separate its vendor specific callback ops, which are passed via @opaque, into "init" and "runtime" ops without having to stash away the "init" ops. No functional change intended. Reviewed-by: NCornelia Huck <cohuck@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> #s390 Acked-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 3月, 2020 1 次提交
-
-
由 Sean Christopherson 提交于
Reset the LRU slot if it becomes invalid when deleting a memslot to fix an out-of-bounds/use-after-free access when searching through memslots. Explicitly check for there being no used slots in search_memslots(), and in the caller of s390's approximation variant. Fixes: 36947254 ("KVM: Dynamically size memslot array based on number of used slots") Reported-by: NQian Cai <cai@lca.pw> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320205546.2396-2-sean.j.christopherson@intel.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 3月, 2020 7 次提交
-
-
由 Marc Zyngier 提交于
The vgic-state debugfs file could do with showing the pending state of the HW-backed SGIs. Plug it into the low-level code. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-24-maz@kernel.org
-
由 Marc Zyngier 提交于
Each time a Group-enable bit gets flipped, the state of these bits needs to be forwarded to the hardware. This is a pretty heavy handed operation, requiring all vcpus to reload their GICv4 configuration. It is thus implemented as a new request type. These enable bits are programmed into the HW by setting the VGrp{0,1}En fields of GICR_VPENDBASER when the vPEs are made resident again. Of course, we only support Group-1 for now... Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-22-maz@kernel.org
-
由 Marc Zyngier 提交于
The GICv4.1 architecture gives the hypervisor the option to let the guest choose whether it wants the good old SGIs with an active state, or the new, HW-based ones that do not have one. For this, plumb the configuration of SGIs into the GICv3 MMIO handling, present the GICD_TYPER2.nASSGIcap to the guest, and handle the GICD_CTLR.nASSGIreq setting. In order to be able to deal with the restore of a guest, also apply the GICD_CTLR.nASSGIreq setting at first run so that we can move the restored SGIs to the HW if that's what the guest had selected in a previous life. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-21-maz@kernel.org
-
由 Marc Zyngier 提交于
In order to hide some of the differences between v4.0 and v4.1, move the doorbell management out of the KVM code, and into the GICv4-specific layer. This allows the calling code to ask for the doorbell when blocking, and otherwise to leave the doorbell permanently disabled. This matches the v4.1 code perfectly, and only results in a minor refactoring of the v4.0 code. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-14-maz@kernel.org
-
由 Marc Zyngier 提交于
In order to let a guest buy in the new, active-less SGIs, we need to be able to switch between the two modes. Handle this by stopping all guest activity, transfer the state from one mode to the other, and resume the guest. Nothing calls this code so far, but a later patch will plug it into the MMIO emulation. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-20-maz@kernel.org
-
由 Marc Zyngier 提交于
Most of the GICv3 emulation code that deals with SGIs now has to be aware of the v4.1 capabilities in order to benefit from it. Add such support, keyed on the interrupt having the hw flag set and being a SGI. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-19-maz@kernel.org
-
由 Marc Zyngier 提交于
As GICv4.1 understands the life cycle of doorbells (instead of just randomly firing them at the most inconvenient time), just enable them at irq_request time, and be done with it. Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-18-maz@kernel.org
-
- 17 3月, 2020 19 次提交
-
-
由 Sean Christopherson 提交于
Drop largepages_enabled, kvm_largepages_enabled() and kvm_disable_largepages() now that all users are gone. Note, largepages_enabled was an x86-only flag that got left in common KVM code when KVM gained support for multiple architectures. No functional change intended. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Peter Xu 提交于
It's never used anywhere now. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jay Zhou 提交于
It could take kvm->mmu_lock for an extended period of time when enabling dirty log for the first time. The main cost is to clear all the D-bits of last level SPTEs. This situation can benefit from manual dirty log protect as well, which can reduce the mmu_lock time taken. The sequence is like this: 1. Initialize all the bits of the dirty bitmap to 1 when enabling dirty log for the first time 2. Only write protect the huge pages 3. KVM_GET_DIRTY_LOG returns the dirty bitmap info 4. KVM_CLEAR_DIRTY_LOG will clear D-bit for each of the leaf level SPTEs gradually in small chunks Under the Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz environment, I did some tests with a 128G windows VM and counted the time taken of memory_global_dirty_log_start, here is the numbers: VM Size Before After optimization 128G 460ms 10ms Signed-off-by: NJay Zhou <jianjay.zhou@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Peter Xu 提交于
Remove includes of asm/kvm_host.h from files that already include linux/kvm_host.h to make it more obvious that there is no ordering issue between the two headers. linux/kvm_host.h includes asm/kvm_host.h to pick up architecture specific settings, and this will never change, i.e. including asm/kvm_host.h after linux/kvm_host.h may seem problematic, but in practice is simply redundant. Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that the memslot logic doesn't assume memslots are always non-NULL, dynamically size the array of memslots instead of unconditionally allocating memory for the maximum number of memslots. Note, because a to-be-deleted memslot must first be invalidated, the array size cannot be immediately reduced when deleting a memslot. However, consecutive deletions will realize the memory savings, i.e. a second deletion will trim the entry. Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Tested-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Refactor memslot handling to treat the number of used slots as the de facto size of the memslot array, e.g. return NULL from id_to_memslot() when an invalid index is provided instead of relying on npages==0 to detect an invalid memslot. Rework the sorting and walking of memslots in advance of dynamically sizing memslots to aid bisection and debug, e.g. with luck, a bug in the refactoring will bisect here and/or hit a WARN instead of randomly corrupting memory. Alternatively, a global null/invalid memslot could be returned, i.e. so callers of id_to_memslot() don't have to explicitly check for a NULL memslot, but that approach runs the risk of introducing difficult-to- debug issues, e.g. if the global null slot is modified. Constifying the return from id_to_memslot() to combat such issues is possible, but would require a massive refactoring of arch specific code and would still be susceptible to casting shenanigans. Add function comments to update_memslots() and search_memslots() to explicitly (and loudly) state how memslots are sorted. Opportunistically stuff @hva with a non-canonical value when deleting a private memslot on x86 to detect bogus usage of the freed slot. No functional change intended. Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Tested-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rework kvm_get_dirty_log() so that it "returns" the associated memslot on success. A future patch will rework memslot handling such that id_to_memslot() can return NULL, returning the memslot makes it more obvious that the validity of the memslot has been verified, i.e. precludes the need to add validity checks in the arch code that are technically unnecessary. To maintain ordering in s390, move the call to kvm_arch_sync_dirty_log() from s390's kvm_vm_ioctl_get_dirty_log() to the new kvm_get_dirty_log(). This is a nop for PPC, the only other arch that doesn't select KVM_GENERIC_DIRTYLOG_READ_PROTECT, as its sync_dirty_log() is empty. Ideally, moving the sync_dirty_log() call would be done in a separate patch, but it can't be done in a follow-on patch because that would temporarily break s390's ordering. Making the move in a preparatory patch would be functionally correct, but would create an odd scenario where the moved sync_dirty_log() would operate on a "different" memslot due to consuming the result of a different id_to_memslot(). The memslot couldn't actually be different as slots_lock is held, but the code is confusing enough as it is, i.e. moving sync_dirty_log() in this patch is the lesser of all evils. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code. The arch specific implemenations are extremely similar, differing only in whether the dirty log needs to be sync'd from hardware (x86) and how the TLBs are flushed. Add new arch hooks to handle sync and TLB flush; the sync will also be used for non-generic dirty log support in a future patch (s390). The ulterior motive for providing a common implementation is to eliminate the dependency between arch and common code with respect to the memslot referenced by the dirty log, i.e. to make it obvious in the code that the validity of the memslot is guaranteed, as a future patch will rework memslot handling such that id_to_memslot() can return NULL. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Clean up __kvm_set_memory_region() to achieve several goals: - Remove local variables that serve no real purpose - Improve the readability of the code - Better show the relationship between the 'old' and 'new' memslot - Prepare for dynamically sizing memslots - Document subtle gotchas (via comments) Note, using 'tmp' to hold the initial memslot is not strictly necessary at this juncture, e.g. 'old' could be directly copied from id_to_memslot(), but keep the pointer usage as id_to_memslot() will be able to return a NULL pointer once memslots are dynamically sized. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move memslot deletion into its own routine so that the success path for other memslot updates does not need to use kvm_free_memslot(), i.e. can explicitly destroy the dirty bitmap when necessary. This paves the way for dropping @dont from kvm_free_memslot(), i.e. all callers now pass NULL for @dont. Add a comment above the code to make a copy of the existing memslot prior to deletion, it is not at all obvious that the pointer will become stale during sorting and/or installation of new memslots. Note, kvm_arch_commit_memory_region() allows an architecture to free resources when moving a memslot or changing its flags, e.g. x86 frees its arch specific memslot metadata during commit_memory_region(). Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the "const" attribute from @old in kvm_arch_commit_memory_region() to allow arch specific code to free arch specific resources in the old memslot without having to cast away the attribute. Freeing resources in kvm_arch_commit_memory_region() paves the way for simplifying kvm_free_memslot() by eliminating the last usage of its @dont param. Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Split out the core functionality of setting a memslot into a separate helper in preparation for moving memslot deletion into its own routine. Tested-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Replace a big pile o' gotos with returns to make it more obvious what error code is being returned, and to prepare for refactoring the functional, i.e. post-checks, portion of __kvm_set_memory_region(). Reviewed-by: NJanosch Frank <frankja@linux.ibm.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Explicitly free an allocated-but-unused dirty bitmap instead of relying on kvm_free_memslot() if an error occurs in __kvm_set_memory_region(). There is no longer a need to abuse kvm_free_memslot() to free arch specific resources as arch specific code is now called only after the common flow is guaranteed to succeed. Arch code can still fail, but it's responsible for its own cleanup in that case. Eliminating the error path's abuse of kvm_free_memslot() paves the way for simplifying kvm_free_memslot(), i.e. dropping its @dont param. Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
The two implementations of kvm_arch_create_memslot() in x86 and PPC are both good citizens and free up all local resources if creation fails. Return immediately (via a superfluous goto) instead of calling kvm_free_memslot(). Note, the call to kvm_free_memslot() is effectively an expensive nop in this case as there are no resources to be freed. No functional change intended. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Reinstall the old memslots if preparing the new memory region fails after invalidating a to-be-{re}moved memslot. Remove the superfluous 'old_memslots' variable so that it's somewhat clear that the error handling path needs to free the unused memslots, not simply the 'old' memslots. Fixes: bc6678a3 ("KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update") Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 KarimAllah Ahmed 提交于
Use the physical timer structure when reading the physical counter instead of using the virtual timer structure. Thankfully, nothing is accessing this code path yet (at least not until we enable save/restore of the physical counter). It doesn't hurt for this to be correct though. Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> [maz: amended commit log] Signed-off-by: NMarc Zyngier <maz@kernel.org> Reviewed-by: NZenghui Yu <yuzenghui@huawei.com> Fixes: 84135d3d ("KVM: arm/arm64: consolidate arch timer trap handlers") Link: https://lore.kernel.org/r/1584351546-5018-1-git-send-email-karahmed@amazon.de
-
- 17 2月, 2020 1 次提交
-
-
由 Mark Rutland 提交于
With VHE, running a vCPU always requires the sequence: 1. kvm_arm_vhe_guest_enter(); 2. kvm_vcpu_run_vhe(); 3. kvm_arm_vhe_guest_exit() ... and as we invoke this from the shared arm/arm64 KVM code, 32-bit arm has to provide stubs for all three functions. To simplify the common code, and make it easier to make further modifications to the arm64-specific portions in the near future, let's fold kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() into kvm_vcpu_run_vhe(). The 32-bit stubs for kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() are removed, as they are no longer used. The 32-bit stub for kvm_vcpu_run_vhe() is left as-is. There should be no functional change as a result of this patch. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200210114757.2889-1-mark.rutland@arm.com
-
- 12 2月, 2020 1 次提交
-
-
由 Marc Zyngier 提交于
Accessing a per-cpu variable only makes sense when preemption is disabled (and the kernel does check this when the right debug options are switched on). For kvm_get_running_vcpu(), it is fine to return the value after re-enabling preemption, as the preempt notifiers will make sure that this is kept consistent across task migration (the comment above the function hints at it, but lacks the crucial preemption management). While we're at it, move the comment from the ARM code, which explains why the whole thing works. Fixes: 7495e22b ("KVM: Move running VCPU from ARM to common code"). Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: NZenghui Yu <yuzenghui@huawei.com> Tested-by: NZenghui Yu <yuzenghui@huawei.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/318984f6-bc36-33a3-abc6-bf2295974b06@huawei.com Message-id: <20200207163410.31276-1-maz@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 05 2月, 2020 2 次提交
-
-
由 Zhuang Yanying 提交于
We are testing Virtual Machine with KSM on v5.4-rc2 kernel, and found the zero_page refcount overflow. The cause of refcount overflow is increased in try_async_pf (get_user_page) without being decreased in mmu_set_spte() while handling ept violation. In kvm_release_pfn_clean(), only unreserved page will call put_page. However, zero page is reserved. So, as well as creating and destroy vm, the refcount of zero page will continue to increase until it overflows. step1: echo 10000 > /sys/kernel/pages_to_scan/pages_to_scan echo 1 > /sys/kernel/pages_to_scan/run echo 1 > /sys/kernel/pages_to_scan/use_zero_pages step2: just create several normal qemu kvm vms. And destroy it after 10s. Repeat this action all the time. After a long period of time, all domains hang because of the refcount of zero page overflow. Qemu print error log as follow: … error: kvm run failed Bad address EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4 EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] FS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] GS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy GDT= 000f7070 00000037 IDT= 000f70ae 00000000 CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04 … Meanwhile, a kernel warning is departed. [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30 [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G OE 5.2.0-rc2 #5 [40914.836415] RIP: 0010:try_get_page+0x1f/0x30 [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 0 0 00 00 00 48 8b 47 08 a8 [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286 [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000 [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440 [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000 [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8 [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440 [40914.836423] FS: 00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000 [40914.836425] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0 [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [40914.836428] Call Trace: [40914.836433] follow_page_pte+0x302/0x47b [40914.836437] __get_user_pages+0xf1/0x7d0 [40914.836441] ? irq_work_queue+0x9/0x70 [40914.836443] get_user_pages_unlocked+0x13f/0x1e0 [40914.836469] __gfn_to_pfn_memslot+0x10e/0x400 [kvm] [40914.836486] try_async_pf+0x87/0x240 [kvm] [40914.836503] tdp_page_fault+0x139/0x270 [kvm] [40914.836523] kvm_mmu_page_fault+0x76/0x5e0 [kvm] [40914.836588] vcpu_enter_guest+0xb45/0x1570 [kvm] [40914.836632] kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm] [40914.836645] kvm_vcpu_ioctl+0x26e/0x5d0 [kvm] [40914.836650] do_vfs_ioctl+0xa9/0x620 [40914.836653] ksys_ioctl+0x60/0x90 [40914.836654] __x64_sys_ioctl+0x16/0x20 [40914.836658] do_syscall_64+0x5b/0x180 [40914.836664] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [40914.836666] RIP: 0033:0x7fb61cb6bfc7 Signed-off-by: NLinFeng <linfeng23@huawei.com> Signed-off-by: NZhuang Yanying <ann.zhuangyanying@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jeremy Cline 提交于
Fedora kernel builds on armv7hl began failing recently because kvm_arm_exception_type and kvm_arm_exception_class were undeclared in trace.h. Add the missing include. Fixes: 0e20f5e2 ("KVM: arm/arm64: Cleanup MMIO handling") Signed-off-by: NJeremy Cline <jcline@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200205134146.82678-1-jcline@redhat.com
-
- 31 1月, 2020 2 次提交
-
-
由 Boris Ostrovsky 提交于
__kvm_map_gfn()'s call to gfn_to_pfn_memslot() is * relatively expensive * in certain cases (such as when done from atomic context) cannot be called Stashing gfn-to-pfn mapping should help with both cases. This is part of CVE-2019-3016. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: NJoao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Boris Ostrovsky 提交于
kvm_vcpu_(un)map operates on gfns from any current address space. In certain cases we want to make sure we are not mapping SMRAM and for that we can use kvm_(un)map_gfn() that we are introducing in this patch. This is part of CVE-2019-3016. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: NJoao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 1月, 2020 2 次提交
-
-
由 Alexandru Elisei 提交于
According to the ARM ARM, registers CNT{P,V}_TVAL_EL0 have bits [63:32] RES0 [1]. When reading the register, the value is truncated to the least significant 32 bits [2], and on writes, TimerValue is treated as a signed 32-bit integer [1, 2]. When the guest behaves correctly and writes 32-bit values, treating TVAL as an unsigned 64 bit register works as expected. However, things start to break down when the guest writes larger values, because (u64)0x1_ffff_ffff = 8589934591. but (s32)0x1_ffff_ffff = -1, and the former will cause the timer interrupt to be asserted in the future, but the latter will cause it to be asserted now. Let's treat TVAL as a signed 32-bit register on writes, to match the behaviour described in the architecture, and the behaviour experimentally exhibited by the virtual timer on a non-vhe host. [1] Arm DDI 0487E.a, section D13.8.18 [2] Arm DDI 0487E.a, section D11.2.4 Signed-off-by: NAlexandru Elisei <alexandru.elisei@arm.com> [maz: replaced the read-side mask with lower_32_bits] Signed-off-by: NMarc Zyngier <maz@kernel.org> Fixes: 8fa76162 ("KVM: arm/arm64: arch_timer: Fix CNTP_TVAL calculation") Link: https://lore.kernel.org/r/20200127103652.2326-1-alexandru.elisei@arm.com
-
由 Eric Auger 提交于
Let the code never use unsupported event counters. Change kvm_pmu_handle_pmcr() to only reset supported counters and kvm_pmu_vcpu_reset() to only stop supported counters. Other actions are filtered on the supported counters in kvm/sysregs.c Signed-off-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200124142535.29386-5-eric.auger@redhat.com
-