- 09 3月, 2021 1 次提交
-
-
由 Tom Lendacky 提交于
An SEV guest requires that virtio devices use the DMA API to allow the hypervisor to successfully access guest memory as needed. The VIRTIO_F_VERSION_1 and VIRTIO_F_ACCESS_PLATFORM features tell virtio to use the DMA API. Add arch_has_restricted_virtio_memory_access() for x86, to fail the device probe if these features have not been set for the device when running as an SEV guest. [ bp: Fix -Wmissing-prototypes warning Reported-by: kernel test robot <lkp@intel.com> ] Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/b46e0211f77ca1831f11132f969d470a6ffc9267.1614897610.git.thomas.lendacky@amd.com
-
- 06 3月, 2021 1 次提交
-
-
由 Borislav Petkov 提交于
vc_decode_insn() calls copy_from_kernel_nofault() by way of vc_fetch_insn_kernel() to fetch 15 bytes max of opcodes to decode. copy_from_kernel_nofault() returns negative on error and 0 on success. The error case is handled by returning ES_EXCEPTION. In the success case, the ret variable which contains the return value is 0 so there's no need to subtract it from MAX_INSN_SIZE when initializing the insn buffer for further decoding. Remove it. No functional changes. Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NJoerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20210223111130.16201-1-bp@alien8.de
-
- 03 3月, 2021 6 次提交
-
-
由 Juergen Gross 提交于
Since commit 9e2369c0 ("xen: add helpers to allocate unpopulated memory") foreign mappings are using guest physical addresses allocated via ZONE_DEVICE functionality. This will result in problems for the case of no balloon memory hotplug being configured, as the p2m list will only cover the initial memory size of the domain. Any ZONE_DEVICE allocated address will be outside the p2m range and thus a mapping can't be established with that memory address. Fix that by extending the p2m size for that case. At the same time add a check for a to be created mapping to be within the p2m limits in order to detect errors early. While changing a comment, remove some 32-bit leftovers. This is XSA-369. Fixes: 9e2369c0 ("xen: add helpers to allocate unpopulated memory") Cc: <stable@vger.kernel.org> # 5.9 Reported-by: NMarek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NJuergen Gross <jgross@suse.com>
-
由 Jan Beulich 提交于
Bailing immediately from set_foreign_p2m_mapping() upon a p2m updating error leaves the full batch in an ambiguous state as far as the caller is concerned. Instead flags respective slots as bad, unmapping what was mapped there right away. HYPERVISOR_grant_table_op()'s return value and the individual unmap slots' status fields get used only for a one-time - there's not much we can do in case of a failure. Note that there's no GNTST_enomem or alike, so GNTST_general_error gets used. The map ops' handle fields get overwritten just to be on the safe side. This is part of XSA-367. Cc: <stable@vger.kernel.org> Signed-off-by: NJan Beulich <jbeulich@suse.com> Reviewed-by: NJuergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/96cccf5d-e756-5f53-b91a-ea269bfb9be0@suse.comSigned-off-by: NJuergen Gross <jgross@suse.com>
-
由 Babu Moger 提交于
This problem was reported on a SVM guest while executing kexec. Kexec fails to load the new kernel when the PCID feature is enabled. When kexec starts loading the new kernel, it starts the process by resetting the vCPU's and then bringing each vCPU online one by one. The vCPU reset is supposed to reset all the register states before the vCPUs are brought online. However, the CR4 register is not reset during this process. If this register is already setup during the last boot, all the flags can remain intact. The X86_CR4_PCIDE bit can only be enabled in long mode. So, it must be enabled much later in SMP initialization. Having the X86_CR4_PCIDE bit set during SMP boot can cause a boot failures. Fix the issue by resetting the CR4 register in init_vmcb(). Signed-off-by: NBabu Moger <babu.moger@amd.com> Message-Id: <161471109108.30811.6392805173629704166.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Woodhouse 提交于
This is how Xen guests do steal time accounting. The hypervisor records the amount of time spent in each of running/runnable/blocked/offline states. In the Xen accounting, a vCPU is still in state RUNSTATE_running while in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly schedules does the state become RUNSTATE_blocked. In KVM this means that even when the vCPU exits the kvm_run loop, the state remains RUNSTATE_running. The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given amount of time to the blocked state and subtract it from the running state. The state_entry_time corresponds to get_kvmclock_ns() at the time the vCPU entered the current state, and the total times of all four states should always add up to state_entry_time. Co-developed-by: NJoao Martins <joao.m.martins@oracle.com> Signed-off-by: NJoao Martins <joao.m.martins@oracle.com> Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Message-Id: <20210301125309.874953-2-dwmw2@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Woodhouse 提交于
When clearing the per-vCPU shared regions, set the return value to zero to indicate success. This was causing spurious errors to be returned to userspace on soft reset. Also add a paranoid BUILD_BUG_ON() for compat structure compatibility. Fixes: 0c165b3c ("KVM: x86/xen: Allow reset of Xen attributes") Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Message-Id: <20210301125309.874953-1-dwmw2@infradead.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The Xen hypercall interface adds to the attack surface of the hypervisor and will be used quite rarely. Allow compiling it out. Suggested-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDavid Woodhouse <dwmw@amazon.co.uk> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 2月, 2021 4 次提交
-
-
由 NeilBrown 提交于
The memtype seq_file iterator allocates a buffer in the ->start and ->next functions and frees it in the ->show function. The preferred handling for such resources is to free them in the subsequent ->next or ->stop function call. Since Commit 1f4aace6 ("fs/seq_file.c: simplify seq_file iteration code and interface") there is no guarantee that ->show will be called after ->next, so this function can now leak memory. So move the freeing of the buffer to ->next and ->stop. Link: https://lkml.kernel.org/r/161248539022.21478.13874455485854739066.stgit@noble1 Fixes: 1f4aace6 ("fs/seq_file.c: simplify seq_file iteration code and interface") Signed-off-by: NNeilBrown <neilb@suse.de> Cc: Xin Long <lucien.xin@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Marco Elver 提交于
Add KFENCE test suite, testing various error detection scenarios. Makes use of KUnit for test organization. Since KFENCE's interface to obtain error reports is via the console, the test verifies that KFENCE outputs expected reports to the console. [elver@google.com: fix typo in test] Link: https://lkml.kernel.org/r/X9lHQExmHGvETxY4@elver.google.com [elver@google.com: show access type in report] Link: https://lkml.kernel.org/r/20210111091544.3287013-2-elver@google.com Link: https://lkml.kernel.org/r/20201103175841.3495947-9-elver@google.comSigned-off-by: NAlexander Potapenko <glider@google.com> Signed-off-by: NMarco Elver <elver@google.com> Reviewed-by: NDmitry Vyukov <dvyukov@google.com> Co-developed-by: NAlexander Potapenko <glider@google.com> Reviewed-by: NJann Horn <jannh@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joern Engel <joern@purestorage.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sjpark@amazon.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Marco Elver 提交于
Instead of removing the fault handling portion of the stack trace based on the fault handler's name, just use struct pt_regs directly. Change kfence_handle_page_fault() to take a struct pt_regs, and plumb it through to kfence_report_error() for out-of-bounds, use-after-free, or invalid access errors, where pt_regs is used to generate the stack trace. If the kernel is a DEBUG_KERNEL, also show registers for more information. Link: https://lkml.kernel.org/r/20201105092133.2075331-1-elver@google.comSigned-off-by: NMarco Elver <elver@google.com> Suggested-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexander Potapenko 提交于
Add architecture specific implementation details for KFENCE and enable KFENCE for the x86 architecture. In particular, this implements the required interface in <asm/kfence.h> for setting up the pool and providing helper functions for protecting and unprotecting pages. For x86, we need to ensure that the pool uses 4K pages, which is done using the set_memory_4k() helper function. [elver@google.com: add missing copyright and description header] Link: https://lkml.kernel.org/r/20210118092159.145934-2-elver@google.com Link: https://lkml.kernel.org/r/20201103175841.3495947-3-elver@google.comSigned-off-by: NMarco Elver <elver@google.com> Signed-off-by: NAlexander Potapenko <glider@google.com> Reviewed-by: NDmitry Vyukov <dvyukov@google.com> Co-developed-by: NMarco Elver <elver@google.com> Reviewed-by: NJann Horn <jannh@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joern Engel <joern@purestorage.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sjpark@amazon.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2021 4 次提交
-
-
由 Paolo Bonzini 提交于
A missing flush would cause the static branch to trigger incorrectly. Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Check that PML is actually enabled before setting the mask to force a SPTE to be write-protected. The bits used for the !AD_ENABLED case are in the upper half of the SPTE. With 64-bit paging and EPT, these bits are ignored, but with 32-bit PAE paging they are reserved. Setting them for L2 SPTEs without checking PML breaks NPT on 32-bit KVM. Fixes: 1f4e5fc8 ("KVM: x86: fix nested guest live migration with PML") Cc: stable@vger.kernel.org Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210225204749.1512652-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Reported by syzkaller: KASAN: null-ptr-deref in range [0x0000000000000140-0x0000000000000147] CPU: 1 PID: 8370 Comm: syz-executor859 Not tainted 5.11.0-syzkaller #0 RIP: 0010:synic_get arch/x86/kvm/hyperv.c:165 [inline] RIP: 0010:kvm_hv_set_sint_gsi arch/x86/kvm/hyperv.c:475 [inline] RIP: 0010:kvm_hv_irq_routing_update+0x230/0x460 arch/x86/kvm/hyperv.c:498 Call Trace: kvm_set_irq_routing+0x69b/0x940 arch/x86/kvm/../../../virt/kvm/irqchip.c:223 kvm_vm_ioctl+0x12d0/0x2800 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3959 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Hyper-V context is lazily allocated until Hyper-V specific MSRs are accessed or SynIC is enabled. However, the syzkaller testcase sets irq routing table directly w/o enabling SynIC. This results in null-ptr-deref when accessing SynIC Hyper-V context. This patch fixes it. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=163342ccd00000 Reported-by: syzbot+6987f3b2dbd9eda95f12@syzkaller.appspotmail.com Fixes: 8f014550 ("KVM: x86: hyper-v: Make Hyper-V emulation enablement conditional") Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1614326399-5762-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Dongli Zhang 提交于
The 'mmu_page_hash' is used as hash table while 'active_mmu_pages' is a list. Remove the misplaced comment as it's mostly stating the obvious anyways. Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210226061945.1222-1-dongli.zhang@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 2月, 2021 1 次提交
-
-
由 Sean Christopherson 提交于
Fix the interpreation of nested_svm_vmexit()'s return value when synthesizing a nested VM-Exit after intercepting an SVM instruction while L2 was running. The helper returns '0' on success, whereas a return value of '0' in the exit handler path means "exit to userspace". The incorrect return value causes KVM to exit to userspace without filling the run state, e.g. QEMU logs "KVM: unknown exit, hardware reason 0". Fixes: 14c2bf81 ("KVM: SVM: Fix #GP handling for doubly-nested virtualization") Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210224005627.657028-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 2月, 2021 5 次提交
-
-
由 Sami Tolvanen 提交于
Pass code model and stack alignment to the linker as these are not stored in LLVM bitcode, and allow CONFIG_LTO_CLANG* to be enabled. Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
由 Sami Tolvanen 提交于
Clang incorrectly inlines functions with differing stack protector attributes, which breaks __restore_processor_state() that relies on stack protector being disabled. This change disables LTO for cpu.c to work aroung the bug. Link: https://bugs.llvm.org/show_bug.cgi?id=47479Suggested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com>
-
由 Sami Tolvanen 提交于
Disable LTO for the vDSO. Note that while we could use Clang's LTO for the 64-bit vDSO, it won't add noticeable benefit for the small amount of C code. Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
由 Sami Tolvanen 提交于
Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use objtool to generate __mcount_loc sections for dynamic ftrace with Clang and gcc <5 (later versions of gcc use -mrecord-mcount). Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
由 Like Xu 提交于
If lbr_desc->event is successfully created, the intel_pmu_create_ guest_lbr_event() will return 0, otherwise it will return -ENOENT, and then jump to LBR msrs dummy handling. Fixes: 1b5ac322 ("KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE") Signed-off-by: NLike Xu <like.xu@linux.intel.com> Message-Id: <20210223013958.1280444-1-like.xu@linux.intel.com> [Add "< 0" and PTR_ERR to make the code clearer. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 2月, 2021 3 次提交
-
-
由 David Stevens 提交于
Track the range being invalidated by mmu_notifier and skip page fault retries if the fault address is not affected by the in-progress invalidation. Handle concurrent invalidations by finding the minimal range which includes all ranges being invalidated. Although the combined range may include unrelated addresses and cannot be shrunk as individual invalidation operations complete, it is unlikely the marginal gains of proper range tracking are worth the additional complexity. The primary benefit of this change is the reduction in the likelihood of extreme latency when handing a page fault due to another thread having been preempted while modifying host virtual addresses. Signed-off-by: NDavid Stevens <stevensd@chromium.org> Message-Id: <20210222024522.1751719-3-stevensd@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Don't retry a page fault due to an mmu_notifier invalidation when handling a page fault for a GPA that did not resolve to a memslot, i.e. an MMIO page fault. Invalidations from the mmu_notifier signal a change in a host virtual address (HVA) mapping; without a memslot, there is no HVA and thus no possibility that the invalidation is relevant to the page fault being handled. Note, the MMIO vs. memslot generation checks handle the case where a pending memslot will create a memslot overlapping the faulting GPA. The mmu_notifier checks are orthogonal to memslot updates. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210222024522.1751719-2-stevensd@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Right now, enter_svm_guest_mode is calling nested_prepare_vmcb_save and nested_prepare_vmcb_control. This results in is_guest_mode being false until the end of nested_prepare_vmcb_control. This is a problem because nested_prepare_vmcb_save can in turn cause changes to the intercepts and these have to be applied to the "host VMCB" (stored in svm->nested.hsave) and then merged with the VMCB12 intercepts into svm->vmcb. In particular, without this change we forget to set the CR0 read and CR0 write intercepts when running a real mode L2 guest with NPT disabled. The guest is therefore able to see the CR0.PG bit that KVM sets to enable "paged real mode". This patch fixes the svm.flat mode_switch test case with npt=0. There are no other problematic calls in nested_prepare_vmcb_save. Moving is_guest_mode to the end is done since commit 06fc7772 ("KVM: SVM: Activate nested state only when guest state is complete", 2010-04-25). However, back then KVM didn't grab a different VMCB when updating the intercepts, it had already copied/merged L1's stuff to L0's VMCB, and then updated L0's VMCB regardless of is_nested(). Later recalc_intercepts was introduced in commit 384c6368 ("KVM: SVM: Add function to recalculate intercept masks", 2011-01-12). This introduced the bug, because recalc_intercepts now throws away the intercept manipulations that svm_set_cr0 had done in the meanwhile to svm->vmcb. [1] https://lore.kernel.org/kvm/1266493115-28386-1-git-send-email-joerg.roedel@amd.com/Reviewed-by: NSean Christopherson <seanjc@google.com> Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 22 2月, 2021 3 次提交
-
-
由 Jens Axboe 提交于
PF_IO_WORKER are kernel threads too, but they aren't PF_KTHREAD in the sense that we don't assign ->set_child_tid with our own structure. Just ensure that every arch sets up the PF_IO_WORKER threads like kthreads in the arch implementation of copy_thread(). Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Masahiro Yamada 提交于
The 'syscall' variables are not directly used in the commands. Remove the $(srctree)/ prefix because we can rely on VPATH. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
The rules in these Makefiles cannot detect the command line change because the prerequisite 'FORCE' is missing. Adding 'FORCE' will result in the headers being rebuilt every time because the 'targets' additions are also wrong; the file paths in 'targets' must be relative to the current Makefile. Fix all of them so the if_changed rules work correctly. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
- 19 2月, 2021 12 次提交
-
-
由 Sean Christopherson 提交于
Remove several exports from the MMU that are no longer necessary. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-15-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop kvm_mmu_slot_largepage_remove_write_access() and refactor its sole caller to use kvm_mmu_slot_remove_write_access(). Remove the now-unused slot_handle_large_level() and slot_handle_all_level() helpers. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-14-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Stop setting dirty bits for MMU pages when dirty logging is disabled for a memslot, as PML is now completely disabled when there are no memslots with dirty logging enabled. This means that spurious PML entries will be created for memslots with dirty logging disabled if at least one other memslot has dirty logging enabled. However, spurious PML entries are already possible since dirty bits are set only when a dirty logging is turned off, i.e. memslots that are never dirty logged will have dirty bits cleared. In the end, it's faster overall to eat a few spurious PML entries in the window where dirty logging is being disabled across all memslots. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-13-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Makarand Sonare 提交于
Currently, if enable_pml=1 PML remains enabled for the entire lifetime of the VM irrespective of whether dirty logging is enable or disabled. When dirty logging is disabled, all the pages of the VM are manually marked dirty, so that PML is effectively non-operational. Setting the dirty bits is an expensive operation which can cause severe MMU lock contention in a performance sensitive path when dirty logging is disabled after a failed or canceled live migration. Manually setting dirty bits also fails to prevent PML activity if some code path clears dirty bits, which can incur unnecessary VM-Exits. In order to avoid this extra overhead, dynamically enable/disable PML when dirty logging gets turned on/off for the first/last memslot. Signed-off-by: NMakarand Sonare <makarandsonare@google.com> Co-developed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-12-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a sanity check in kvm_mmu_slot_apply_flags to assert that the LOG_DIRTY_PAGES flag is indeed being toggled, and explicitly rely on that holding true when zapping collapsible SPTEs. Manipulating the CPU dirty log (PML) and write-protection also relies on this assertion, but that's not obvious in the current code. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-11-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the facade of KVM's PML logic being vendor specific and move the bits that aren't truly VMX specific into common x86 code. The MMU logic for dealing with PML is tightly coupled to the feature and to VMX's implementation, bouncing through kvm_x86_ops obfuscates the code without providing any meaningful separation of concerns or encapsulation. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-10-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Store the vendor-specific dirty log size in a variable, there's no need to wrap it in a function since the value is constant after hardware_setup() runs. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-9-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Expand the comment about need to use write-protection for nested EPT when PML is enabled to clarify that the tagging is a nop when PML is _not_ enabled. Without the clarification, omitting the PML check looks wrong at first^Wfifth glance. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-8-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Unconditionally disable PML in vmcs02, KVM emulates PML purely in the MMU, e.g. vmx_flush_pml_buffer() doesn't even try to copy the L2 GPAs from vmcs02's buffer to vmcs12. At best, enabling PML is a nop. At worst, it will cause vmx_flush_pml_buffer() to record bogus GFNs in the dirty logs. Initialize vmcs02.GUEST_PML_INDEX such that PML writes would trigger VM-Exit if PML was somehow enabled, skip flushing the buffer for guest mode since the index is bogus, and freak out if a PML full exit occurs when L2 is active. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-7-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
When zapping SPTEs in order to rebuild them as huge pages, use the new helper that computes the max mapping level to detect whether or not a SPTE should be zapped. Doing so avoids zapping SPTEs that can't possibly be rebuilt as huge pages, e.g. due to hardware constraints, memslot alignment, etc... This also avoids zapping SPTEs that are still large, e.g. if migration was canceled before write-protected huge pages were shattered to enable dirty logging. Note, such pages are still write-protected at this time, i.e. a page fault VM-Exit will still occur. This will hopefully be addressed in a future patch. Sadly, TDP MMU loses its const on the memslot, but that's a pervasive problem that's been around for quite some time. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-6-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Pass the memslot to the rmap callbacks, it will be used when zapping collapsible SPTEs to verify the memslot is compatible with hugepages before zapping its SPTEs. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-5-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Factor out the logic for determining the maximum mapping level given a memslot and a gpa. The helper will be used when zapping collapsible SPTEs when disabling dirty logging, e.g. to avoid zapping SPTEs that can't possibly be rebuilt as hugepages. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210213005015.1651772-4-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-