- 28 9月, 2020 14 次提交
-
-
由 Paolo Bonzini 提交于
We are going to use it for SVM too, so use a more generic name. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Graf 提交于
It's not desireable to have all MSRs always handled by KVM kernel space. Some MSRs would be useful to handle in user space to either emulate behavior (like uCode updates) or differentiate whether they are valid based on the CPU model. To allow user space to specify which MSRs it wants to see handled by KVM, this patch introduces a new ioctl to push filter rules with bitmaps into KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access. With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the denied MSR events to user space to operate on. If no filter is populated, MSR handling stays identical to before. Signed-off-by: NAlexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-8-graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Graf 提交于
In the following commits we will add pieces of MSR filtering. To ensure that code compiles even with the feature half-merged, let's add a few stubs and struct definitions before the real patches start. Signed-off-by: NAlexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-4-graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Graf 提交于
MSRs are weird. Some of them are normal control registers, such as EFER. Some however are registers that really are model specific, not very interesting to virtualization workloads, and not performance critical. Others again are really just windows into package configuration. Out of these MSRs, only the first category is necessary to implement in kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against certain CPU models and MSRs that contain information on the package level are much better suited for user space to process. However, over time we have accumulated a lot of MSRs that are not the first category, but still handled by in-kernel KVM code. This patch adds a generic interface to handle WRMSR and RDMSR from user space. With this, any future MSR that is part of the latter categories can be handled in user space. Furthermore, it allows us to replace the existing "ignore_msrs" logic with something that applies per-VM rather than on the full system. That way you can run productive VMs in parallel to experimental ones where you don't care about proper MSR handling. Signed-off-by: NAlexander Graf <graf@amazon.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20200925143422.21718-3-graf@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename the "shared_msrs" mechanism, which is used to defer restoring MSRs that are only consumed when running in userspace, to a more banal but less likely to be confusing "user_return_msrs". The "shared" nomenclature is confusing as it's not obvious who is sharing what, e.g. reasonable interpretations are that the guest value is shared by vCPUs in a VM, or that the MSR value is shared/common to guest and host, both of which are wrong. "shared" is also misleading as the MSR value (in hardware) is not guaranteed to be shared/reused between VMs (if that's indeed the correct interpretation of the name), as the ability to share values between VMs is simply a side effect (albiet a very nice side effect) of deferring restoration of the host value until returning from userspace. "user_return" avoids the above confusion by describing the mechanism itself instead of its effects. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in terms of what information is captured. On SVM, add interrupt info and error code, while on VMX it add IDT vectoring and error code. This sets the stage for macrofying the kvm_exit tracepoint definition so that it can be reused for kvm_nested_vmexit without loss of information. Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter failed, as the field is guaranteed to be invalid. Note, it'd be possible to further filter the interrupt/exception fields based on the VM-Exit reason, but the helper is intended only for tracepoints, i.e. an extra VMREAD or two is a non-issue, the failed VM-Enter case is just low hanging fruit. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename SECONDARY_EXEC_RDTSCP to SECONDARY_EXEC_ENABLE_RDTSCP in preparation for consolidating the logic for adjusting secondary exec controls based on the guest CPUID model. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923165048.20486-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Replace the existing kvm_x86_ops.need_emulation_on_page_fault() with a more generic is_emulatable(), and unconditionally call the new function in x86_emulate_instruction(). KVM will use the generic hook to support multiple security related technologies that prevent emulation in one way or another. Similar to the existing AMD #NPF case where emulation of the current instruction is not possible due to lack of information, AMD's SEV-ES and Intel's SGX and TDX will introduce scenarios where emulation is impossible due to the guest's register state being inaccessible. And again similar to the existing #NPF case, emulation can be initiated by kvm_mmu_page_fault(), i.e. outside of the control of vendor-specific code. While the cause and architecturally visible behavior of the various cases are different, e.g. SGX will inject a #UD, AMD #NPF is a clean resume or complete shutdown, and SEV-ES and TDX "return" an error, the impact on the common emulation code is identical: KVM must stop emulation immediately and resume the guest. Query is_emulatable() in handle_ud() as well so that the force_emulation_prefix code doesn't incorrectly modify RIP before calling emulate_instruction() in the absurdly unlikely scenario that KVM encounters forced emulation in conjunction with "do not emulate". Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200915232702.15945-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
The new intercept bits have been added in vmcb control area to support few more interceptions. Here are the some of them. - INTERCEPT_INVLPGB, - INTERCEPT_INVLPGB_ILLEGAL, - INTERCEPT_INVPCID, - INTERCEPT_MCOMMIT, - INTERCEPT_TLBSYNC, Add a new intercept word in vmcb_control_area to support these instructions. Also update kvm_nested_vmrun trace function to support the new addition. AMD documentation for these instructions is available at "AMD64 Architecture Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34(or later)" The documentation can be obtained at the links below: Link: https://www.amd.com/system/files/TechDocs/24593.pdf Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985251547.11252.16994139329949066945.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
Convert all the intercepts to one array of 32 bit vectors in vmcb_control_area. This makes it easy for future intercept vector additions. Also update trace functions. Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985250813.11252.5736581193881040525.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
Modify intercept_exceptions to generic intercepts in vmcb_control_area. Use the generic vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept to set/clear/test the intercept_exceptions bits. Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985250037.11252.1361972528657052410.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
Modify intercept_dr to generic intercepts in vmcb_control_area. Use the generic vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept to set/clear/test the intercept_dr bits. Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985249255.11252.10000868032136333355.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
Change intercept_cr to generic intercepts in vmcb_control_area. Use the new vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept where applicable. Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985248506.11252.9081085950784508671.stgit@bmoger-ubuntu> [Change constant names. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Babu Moger 提交于
This is in preparation for the future intercept vector additions. Add new functions vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept using kernel APIs __set_bit, __clear_bit and test_bit espectively. Signed-off-by: NBabu Moger <babu.moger@amd.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <159985247876.11252.16039238014239824460.stgit@bmoger-ubuntu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 9月, 2020 3 次提交
-
-
由 Borislav Petkov 提交于
Use the shorthand to make it more readable. No functional changes. Signed-off-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-5-joro@8bytes.org
-
由 Joerg Roedel 提交于
Building a correct GHCB for the hypervisor requires setting valid bits in the GHCB. Simplify that process by providing accessor functions to set values and to update the valid bitmap and to check the valid bitmap in KVM. Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-4-joro@8bytes.org
-
由 Tom Lendacky 提交于
Extend the vmcb_safe_area with SEV-ES fields and add a new 'struct ghcb' which will be used for guest-hypervisor communication. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-3-joro@8bytes.org
-
- 04 9月, 2020 2 次提交
-
-
由 Peter Zijlstra 提交于
The WARN added in commit 3c73b81a ("x86/entry, selftests: Further improve user entry sanity checks") unconditionally triggers on a IVB machine because it does not support SMAP. For !SMAP hardware the CLAC/STAC instructions are patched out and thus if userspace sets AC, it is still have set after entry. Fixes: 3c73b81a ("x86/entry, selftests: Further improve user entry sanity checks") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NDaniel Thompson <daniel.thompson@linaro.org> Acked-by: NAndy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200902133200.666781610@infradead.org
-
由 Vamshi K Sthambamkadi 提交于
On i386, the order of parameters passed on regs is eax,edx,and ecx (as per regparm(3) calling conventions). Change the mapping in regs_get_kernel_argument(), so that arg1=ax arg2=dx, and arg3=cx. Running the selftests testcase kprobes_args_use.tc shows the result as passed. Fixes: 3c88ee19 ("x86: ptrace: Add function argument access API") Signed-off-by: NVamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200828113242.GA1424@cosmos
-
- 30 8月, 2020 1 次提交
-
-
由 Kyung Min Park 提交于
Intel TSX suspend load tracking instructions aim to give a way to choose which memory accesses do not need to be tracked in the TSX read set. Add TSX suspend load tracking CPUID feature flag TSXLDTRK for enumeration. A processor supports Intel TSX suspend load address tracking if CPUID.0x07.0x0:EDX[16] is present. Two instructions XSUSLDTRK, XRESLDTRK are available when this feature is present. The CPU feature flag is shown as "tsxldtrk" in /proc/cpuinfo. Signed-off-by: NKyung Min Park <kyung.min.park@intel.com> Signed-off-by: NCathy Zhang <cathy.zhang@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1598316478-23337-2-git-send-email-cathy.zhang@intel.com
-
- 26 8月, 2020 1 次提交
-
-
由 Peter Zijlstra 提交于
This allows moving the leave_mm() call into generic code before rcu_idle_enter(). Gets rid of more trace_*_rcuidle() users. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: NMarco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.369441600@infradead.org
-
- 22 8月, 2020 1 次提交
-
-
由 Will Deacon 提交于
The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: NWill Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 8月, 2020 1 次提交
-
-
由 Ard Biesheuvel 提交于
Now that the old memmap code has been removed, some code that was left behind in arch/x86/platform/efi/efi.c is only used for 32-bit builds, which means it can live in efi_32.c as well. So move it over. Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
-
- 19 8月, 2020 1 次提交
-
-
由 Ingo Molnar 提交于
- Fix typos. - Move the compiler barrier comment to the top, because it's valid for the whole function, not just the legacy branch. Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200818053130.GA3161093@gmail.comReviewed-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
-
- 18 8月, 2020 1 次提交
-
-
由 Uros Bizjak 提交于
Current minimum required version of binutils is 2.23, which supports XGETBV and XSETBV instruction mnemonics. Replace the byte-wise specification of XGETBV and XSETBV with these proper mnemonics. Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200707174722.58651-1-ubizjak@gmail.com
-
- 17 8月, 2020 1 次提交
-
-
由 Ricardo Neri 提交于
The SERIALIZE instruction gives software a way to force the processor to complete all modifications to flags, registers and memory from previous instructions and drain all buffered writes to memory before the next instruction is fetched and executed. Thus, it serves the purpose of sync_core(). Use it when available. Suggested-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200807032833.17484-1-ricardo.neri-calderon@linux.intel.com
-
- 13 8月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
segment_eq is only used to implement uaccess_kernel. Just open code uaccess_kernel in the arch uaccess headers and remove one layer of indirection. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NGreentime Hu <green.hu@gmail.com> Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 8月, 2020 2 次提交
-
-
由 Michael Kelley 提交于
Make hv_setup_sched_clock inline so the reference to pv_ops works correctly with objtool updates to detect noinstr violations. See https://lore.kernel.org/patchwork/patch/1283635/Signed-off-by: NMichael Kelley <mikelley@microsoft.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1597022991-24088-1-git-send-email-mikelley@microsoft.comSigned-off-by: NWei Liu <wei.liu@kernel.org>
-
由 Juergen Gross 提交于
Xen is requiring 64-bit machines today and since Xen 4.14 it can be built without 32-bit PV guest support. There is no need to carry the burden of 32-bit PV guest support in the kernel any longer, as new guests can be either HVM or PVH, or they can use a 64 bit kernel. Remove the 32-bit Xen PV support from the kernel. Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NJuergen Gross <jgross@suse.com>
-
- 08 8月, 2020 4 次提交
-
-
由 Mike Rapoport 提交于
Most architectures define pgd_free() as a wrapper for free_page(). Provide a generic version in asm-generic/pgalloc.h and enable its use for most architectures. Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPekka Enberg <penberg@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-7-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
Several architectures define pud_alloc_one() as a wrapper for __get_free_page() and pud_free() as a wrapper for free_page(). Provide a generic implementation in asm-generic/pgalloc.h and use it where appropriate. Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPekka Enberg <penberg@kernel.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-6-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
For most architectures that support >2 levels of page tables, pmd_alloc_one() is a wrapper for __get_free_pages(), sometimes with __GFP_ZERO and sometimes followed by memset(0) instead. More elaborate versions on arm64 and x86 account memory for the user page tables and call to pgtable_pmd_page_ctor() as the part of PMD page initialization. Move the arm64 version to include/asm-generic/pgalloc.h and use the generic version on several architectures. The pgtable_pmd_page_ctor() is a NOP when ARCH_ENABLE_SPLIT_PMD_PTLOCK is not enabled, so there is no functional change for most architectures except of the addition of __GFP_ACCOUNT for allocation of user page tables. The pmd_free() is a wrapper for free_page() in all the cases, so no functional change here. Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPekka Enberg <penberg@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200627143453.31835-5-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
Patch series "mm: cleanup usage of <asm/pgalloc.h>" Most architectures have very similar versions of pXd_alloc_one() and pXd_free_one() for intermediate levels of page table. These patches add generic versions of these functions in <asm-generic/pgalloc.h> and enable use of the generic functions where appropriate. In addition, functions declared and defined in <asm/pgalloc.h> headers are used mostly by core mm and early mm initialization in arch and there is no actual reason to have the <asm/pgalloc.h> included all over the place. The first patch in this series removes unneeded includes of <asm/pgalloc.h> In the end it didn't work out as neatly as I hoped and moving pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require unnecessary changes to arches that have custom page table allocations, so I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local to mm/. This patch (of 8): In most cases <asm/pgalloc.h> header is required only for allocations of page table memory. Most of the .c files that include that header do not use symbols declared in <asm/pgalloc.h> and do not require that header. As for the other header files that used to include <asm/pgalloc.h>, it is possible to move that include into the .c file that actually uses symbols from <asm/pgalloc.h> and drop the include from the header file. The process was somewhat automated using sed -i -E '/[<"]asm\/pgalloc\.h/d' \ $(grep -L -w -f /tmp/xx \ $(git grep -E -l '[<"]asm/pgalloc\.h')) where /tmp/xx contains all the symbols defined in arch/*/include/asm/pgalloc.h. [rppt@linux.ibm.com: fix powerpc warning] Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPekka Enberg <penberg@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 8月, 2020 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 8bb9bf24. It seems the vmalloc page tables aren't always preallocated in all situations, because Jason Donenfeld reports an oops with this commit: BUG: unable to handle page fault for address: ffffe8ffffd00608 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 2 PID: 22 Comm: kworker/2:0 Not tainted 5.8.0+ #154 RIP: process_one_work+0x2c/0x2d0 Code: 41 56 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 40 49 89 c6 45 30 f6 a8 04 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 08 44 8b a8 00 01 05 Call Trace: worker_thread+0x4b/0x3b0 ? rescuer_thread+0x360/0x360 kthread+0x116/0x140 ? __kthread_create_worker+0x110/0x110 ret_from_fork+0x1f/0x30 CR2: ffffe8ffffd00608 and that page fault address is right in that vmalloc space, and we clearly don't have a PGD/P4D entry for it. Looking at the "Code:" line, the actual fault seems to come from the 'pwq->wq' dereference at the top of the process_one_work() function: struct pool_workqueue *pwq = get_work_pwq(work); struct worker_pool *pool = worker->pool; bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE; so 'struct pool_workqueue *pwq' is the allocation that hasn't been synchronized across CPUs. Just revert for now, while Joerg figures out the cause. Reported-and-bisected-by: NJason A. Donenfeld <Jason@zx2c4.com> Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 8月, 2020 3 次提交
-
-
由 Peter Zijlstra 提交于
By using lockdep_assert_*() from seqlock.h, the spaghetti monster attacked. Attack back by reducing seqlock.h dependencies from two key high level headers: - <linux/seqlock.h>: -Remove <linux/ww_mutex.h> - <linux/time.h>: -Remove <linux/seqlock.h> - <linux/sched.h>: +Add <linux/seqlock.h> The price was to add it to sched.h ... Core header fallout, we add direct header dependencies instead of gaining them parasitically from higher level headers: - <linux/dynamic_queue_limits.h>: +Add <asm/bug.h> - <linux/hrtimer.h>: +Add <linux/seqlock.h> - <linux/ktime.h>: +Add <asm/bug.h> - <linux/lockdep.h>: +Add <linux/smp.h> - <linux/sched.h>: +Add <linux/seqlock.h> - <linux/videodev2.h>: +Add <linux/kernel.h> Arch headers fallout: - PARISC: <asm/timex.h>: +Add <asm/special_insns.h> - SH: <asm/io.h>: +Add <asm/page.h> - SPARC: <asm/timer_64.h>: +Add <uapi/asm/asi.h> - SPARC: <asm/vvar.h>: +Add <asm/processor.h>, <asm/barrier.h> -Remove <linux/seqlock.h> - X86: <asm/fixmap.h>: +Add <asm/pgtable_types.h> -Remove <asm/acpi.h> There's also a bunch of parasitic header dependency fallout in .c files, not listed separately. [ mingo: Extended the changelog, split up & fixed the original patch. ] Co-developed-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200804133438.GK2674@hirez.programming.kicks-ass.net
-
由 Ingo Molnar 提交于
The APIC headers are relatively complex and bring in additional header dependencies - while smp.h is a relatively simple header included from high level headers. Remove the dependency and add in the missing #include's in .c files where they gained it indirectly before. Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
MIPS already uses and S390 will need the vdso data pointer in __arch_get_hw_counter(). This works nicely as long as the architecture does not support time namespaces in the VDSO. With time namespaces enabled the regular accessor to the vdso data pointer __arch_get_vdso_data() will return the namespace specific VDSO data page for tasks which are part of a non-root time namespace. This would cause the architectures which need the vdso data pointer in __arch_get_hw_counter() to access the wrong vdso data page. Add a vdso_data pointer argument to __arch_get_hw_counter() and hand it in from the call sites in the core code. For architectures which do not need the data pointer in their counter accessor function the compiler will just optimize it out. Fix up all existing architecture implementations and make MIPS utilize the pointer instead of invoking the accessor function. No functional change and no change in the resulting object code (except MIPS). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/draft-87wo2ekuzn.fsf@nanos.tec.linutronix.de
-
- 03 8月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Change the repeated word "as" to "as a". Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Reviewed-by: NJuergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: xen-devel@lists.xenproject.org Link: https://lore.kernel.org/r/20200726001731.19540-1-rdunlap@infradead.orgSigned-off-by: NJuergen Gross <jgross@suse.com>
-
- 31 7月, 2020 2 次提交
-
-
由 Nick Terrell 提交于
- Add support for zstd compressed kernel - Define __DISABLE_EXPORTS in Makefile - Remove __DISABLE_EXPORTS definition from kaslr.c - Bump the heap size for zstd. - Update the documentation. Integrates the ZSTD decompression code to the x86 pre-boot code. Zstandard requires slightly more memory during the kernel decompression on x86 (192 KB vs 64 KB), and the memory usage is independent of the window size. __DISABLE_EXPORTS is now defined in the Makefile, which covers both the existing use in kaslr.c, and the use needed by the zstd decompressor in misc.c. This patch has been boot tested with both a zstd and gzip compressed kernel on i386 and x86_64 using buildroot and QEMU. Additionally, this has been tested in production on x86_64 devices. We saw a 2 second boot time reduction by switching kernel compression from xz to zstd. Signed-off-by: NNick Terrell <terrelln@fb.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200730190841.2071656-7-nickrterrell@gmail.com
-
由 Sean Christopherson 提交于
Capture the max TDP level during kvm_configure_mmu() instead of using a kvm_x86_ops hook to do it at every vCPU creation. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200716034122.5998-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-