- 26 5月, 2021 1 次提交
-
-
由 Masahiro Yamada 提交于
arch/$(SRCARCH)/Kbuild is useful for Makefile cleanups because you can use the obj-y syntax. Add an empty file if it is missing in arch/$(SRCARCH)/. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 24 5月, 2021 4 次提交
-
-
由 Masahiro Yamada 提交于
I do not see a good reason why only the libelf development package must be so carefully checked. Kbuild generally does not check host tools or libraries. For example, x86_64 defconfig fails to build with no libssl development package installed. scripts/extract-cert.c:21:10: fatal error: openssl/bio.h: No such file or directory 21 | #include <openssl/bio.h> | ^~~~~~~~~~~~~~~ To solve the build error, you need to install libssl-dev or openssl-devel package, depending on your distribution. 'apt-file search', 'dnf provides', etc. is your frined to find a proper package to install. This commit removes all the libelf checks from the top Makefile. If libelf is missing, objtool will fail to build in a similar pattern: .../linux/tools/objtool/include/objtool/elf.h:10:10: fatal error: gelf.h: No such file or directory 10 | #include <gelf.h> You need to install libelf-dev, libelf-devel, or elfutils-libelf-devel to proceed. Another remarkable change is, CONFIG_STACK_VALIDATION (without CONFIG_UNWINDER_ORC) previously continued to build with a warning, but now it will treat missing libelf as an error. This is just a one-time installation, so it should not hurt to break a build and make a user install the package. BTW, the traditional way to handle such checks is autotool, but according to [1], I do not expect the kernel build would have similar scripting like './configure' does. [1]: https://lore.kernel.org/lkml/CA+55aFzr2HTZVOuzpHYDwmtRJLsVzE-yqg2DHpHi_9ePsYp5ug@mail.gmail.com/Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org>
-
由 Masahiro Yamada 提交于
The tools/ directory only exists in the kernel source tree, not in external modules. Do not expose the meaningless targets to external modules. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Feng Tang 提交于
Commit 09c60546 ("./Makefile: add debug option to enable function aligned on 32 bytes") was introduced to help debugging strange kernel performance changes caused by code alignment change. Recently we found 2 similar cases [1][2] caused by code-alignment changes, which can only be identified by forcing 64 bytes aligned for all functions. Originally, 32 bytes was used mainly for not wasting too much text space, but this option is only for debug anyway where text space is not a big concern. So extend the alignment to 64 bytes to cover more similar cases. [1].https://lore.kernel.org/lkml/20210427090013.GG32408@xsang-OptiPlex-9020/ [2].https://lore.kernel.org/lkml/20210420030837.GB31773@xsang-OptiPlex-9020/Signed-off-by: NFeng Tang <feng.tang@intel.com> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Linus Torvalds 提交于
-
- 17 5月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 10 5月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 06 5月, 2021 3 次提交
-
-
由 Masahiro Yamada 提交于
The supported targets for external modules are listed in the help target a few lines below. Let's not have duplicated information in two places. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
'make tags' and friends create tag files in the top directory, but people may manually create tag files in sub-directories. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
This reverts the old commit [1], which seems questionable to me. It claimed 'make distclean' could not remove editor backup files, but I believe KBUILD_OUTPUT or O= was set. When O= is given, Kbuild should always work against $(objtree). If O= is not given, $(objtree) and $(srctree) are the same, therefore $(srctree) is cleaned up. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=dd47df980c02eb33833b2690b033c34fba2fa80dSigned-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 03 5月, 2021 1 次提交
-
-
由 Masahiro Yamada 提交于
Commit 37744fee ("sh: remove sh5 support") removed the SUPERH64 support entirely. Remove the left-over code from the top Makefile. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Acked-by: NArnd Bergmann <arnd@arndb.de>
-
- 01 5月, 2021 2 次提交
-
-
由 Nathan Chancellor 提交于
Currently, -Wunused-but-set-variable is only supported by GCC so it is disabled unconditionally in a GCC only block (it is enabled with W=1). clang currently has its implementation for this warning in review so preemptively move this statement out of the GCC only block and wrap it with cc-disable-warning so that both compilers function the same. Cc: stable@vger.kernel.org Link: https://reviews.llvm.org/D100581Signed-off-by: NNathan Chancellor <nathan@kernel.org> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Tested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
We maintain .gitignore and Makefiles so build artifacts are properly ignored by Git, and cleaned up by 'make clean'. However, the code is always changing; generated files are often moved to another directory, or removed when they become unnecessary. Such garbage files tend to be left over in the source tree because people usually git-pull without cleaning the tree. This is not only the noise for 'git status', but also a build issue in some cases. One solution is to remove a stale file like commit 223c24a7 ("kbuild: Automatically remove stale <linux/version.h> file") did. Such workaround should be removed after a while, but we forget about that if we scatter the workaround code in random places. So, this commit adds a new script to collect cleanings of stale files. As a start point, move the code in arch/arm/boot/compressed/Makefile into this script. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 26 4月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 25 4月, 2021 18 次提交
-
-
由 Nathan Chancellor 提交于
Normally, invocations of $(HOSTCC) include $(KBUILD_HOSTLDFLAGS), which in turn includes $(HOSTLDFLAGS), which allows users to pass in their own flags when linking. However, the 'has_libelf' test does not, meaning that if a user requests a specific linker via HOSTLDFLAGS=-fuse-ld=..., it is not respected and the build might error. For example, if a user building with clang wants to use all of the LLVM tools without any GNU tools, they might remove all of the GNU tools from their system or PATH then build with $ make HOSTLDFLAGS=-fuse-ld=lld LLVM=1 LLVM_IAS=1 which says use all of the LLVM tools, the integrated assembler, and ld.lld for linking host executables. Without this change, the build will error because $(HOSTCC) uses its default linker, rather than the one requested via -fuse-ld=..., which is GNU ld in clang's case in a default configuration. error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel make[1]: *** [Makefile:1260: prepare-objtool] Error 1 Add $(KBUILD_HOSTLDFLAGS) to the 'has_libelf' test so that the linker choice is respected. Link: https://github.com/ClangBuiltLinux/linux/issues/479Signed-off-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
scripts/Makefile.modsign is a subset of scripts/Makefile.modinst, and duplicates the code. Let's merge them. By the way, you do not need to run 'make modules_sign' explicitly because modules are signed as a part of 'make modules_install' when CONFIG_MODULE_SIG_ALL=y. If CONFIG_MODULE_SIG_ALL=n, mod_sign_cmd is set to 'true', so 'make modules_sign' is not functional. In my understanding, the reason of still keeping this is to handle corner cases like commit 64178cb6 ("builddeb: fix stripped module signatures if CONFIG_DEBUG_INFO and CONFIG_MODULE_SIG_ALL are set"). Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
Both mod_strip_cmd and mod_compress_cmd are only used in scripts/Makefile.modinst, hence there is no good reason to define them in the top Makefile. Move the relevant code to scripts/Makefile.modinst. Also, show separate log messages for each of install, strip, sign, and compress. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
scripts/Makefile.modinst is ugly and weird in multiple ways; it specifies real files $(modules) as phony, makes directory manipulation needlessly too complicated. Clean up the Makefile code, and show the full path of installed modules in the log. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
This seems to be useful in sub-make as well. As a preparation of exporting it, rename extmod-prefix to extmod_prefix because exported variables cannot contain hyphens. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
-
由 Masahiro Yamada 提交于
If there are multiple modules with the same name in the same external module tree, there is ambiguity about which one will be loaded, and very likely something odd is happening. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
It is clearer to show the directory which depmod will work on. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
If you attempt to build or install modules ('make modules(_install)' with CONFIG_MODULES disabled, you will get a clear error message, but nothing for external module builds. Factor out the modules and modules_install rules into the common part, so you will get the same error message when you try to build external modules with CONFIG_MODULES=n. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
scripts/Makefile.modinst creates directories as needed. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
The external module build shows the following warning if Module.symvers is missing in the kernel tree. WARNING: Symbol version dump "Module.symvers" is missing. Modules may not have dependencies or modversions. I think this is an important heads-up because the resulting modules may not work as expected. This happens when you did not build the entire kernel tree, for example, you might have prepared the minimal setups for external modules by 'make defconfig && make modules_preapre'. A problem is that 'make modules' creates Module.symvers even without vmlinux. In this case, that warning is suppressed since Module.symvers already exists in spite of its incomplete content. The incomplete (i.e. invalid) Module.symvers should not be created. This commit changes the second pass of modpost to dump symbols into modules-only.symvers. The final Module.symvers is created by concatenating vmlinux.symvers and modules-only.symvers if both exist. Module.symvers is supposed to collect symbols from both vmlinux and modules. It might be a bit confusing, and I am not quite sure if it is an official interface, but presumably it is difficult to rename it because some tools (e.g. kmod) parse it. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
Documentation/process/changes.rst defines the minimum assembler version (binutils version), but we have never checked it in the build time. Kbuild never invokes 'as' directly because all assembly files in the kernel tree are *.S, hence must be preprocessed. I do not expect raw assembly source files (*.s) would be added to the kernel tree. Therefore, we always use $(CC) as the assembler driver, and commit aa824e0c ("kbuild: remove AS variable") removed 'AS'. However, we are still interested in the version of the assembler acting behind. As usual, the --version option prints the version string. $ as --version | head -n 1 GNU assembler (GNU Binutils for Ubuntu) 2.35.1 But, we do not have $(AS). So, we can add the -Wa prefix so that $(CC) passes --version down to the backing assembler. $ gcc -Wa,--version | head -n 1 gcc: fatal error: no input files compilation terminated. OK, we need to input something to satisfy gcc. $ gcc -Wa,--version -c -x assembler /dev/null -o /dev/null | head -n 1 GNU assembler (GNU Binutils for Ubuntu) 2.35.1 The combination of Clang and GNU assembler works in the same way: $ clang -no-integrated-as -Wa,--version -c -x assembler /dev/null -o /dev/null | head -n 1 GNU assembler (GNU Binutils for Ubuntu) 2.35.1 Clang with the integrated assembler fails like this: $ clang -integrated-as -Wa,--version -c -x assembler /dev/null -o /dev/null | head -n 1 clang: error: unsupported argument '--version' to option 'Wa,' For the last case, checking the error message is fragile. If the proposal for -Wa,--version support [1] is accepted, this may not be even an error in the future. One easy way is to check if -integrated-as is present in the passed arguments. We did not pass -integrated-as to CLANG_FLAGS before, but we can make it explicit. Nathan pointed out -integrated-as is the default for all of the architectures/targets that the kernel cares about, but it goes along with "explicit is better than implicit" policy. [2] With all this in my mind, I implemented scripts/as-version.sh to check the assembler version in Kconfig time. $ scripts/as-version.sh gcc GNU 23501 $ scripts/as-version.sh clang -no-integrated-as GNU 23501 $ scripts/as-version.sh clang -integrated-as LLVM 0 [1]: https://github.com/ClangBuiltLinux/linux/issues/1320 [2]: https://lore.kernel.org/linux-kbuild/20210307044253.v3h47ucq6ng25iay@archlinux-ax161/Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Reviewed-by: NNathan Chancellor <nathan@kernel.org>
-
由 Masahiro Yamada 提交于
For simple text replacement, it is better to use a built-in function instead of sed if possible. You can save one process forking. I do not mean to replace all sed invocations because GNU Make itself does not support regular expression (unless you use guile). I just replaced simple ones. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Nathan Chancellor 提交于
When building with LLVM_IAS=1, there is no point to specifying '--prefix=' because that flag is only used to find GNU cross tools, which will not be used indirectly when using the integrated assembler. All of the tools are invoked directly from PATH or a full path specified via the command line, which does not depend on the value of '--prefix='. Sharing commands to reproduce issues becomes a little bit easier without a '--prefix=' value because that '--prefix=' value is specific to a user's machine due to it being an absolute path. Some further notes from Fangrui Song: clang can spawn GNU as (if -f?no-integrated-as is specified) and GNU objcopy (-f?no-integrated-as and -gsplit-dwarf and -g[123]). objcopy is only used for GNU as assembled object files. With integrated assembler, the object file streamer creates .o and .dwo simultaneously. With GNU as, two objcopy commands are needed to extract .debug*.dwo to .dwo files && another command to remove .debug*.dwo sections. A small consequence of this change (to keep things simple) is that '--prefix=' will always be specified now, even with a native build, when it was not before. This should not be an issue due to the way that the Makefile searches for the prefix (based on elfedit's location). This ends up improving the experience for host builds because PATH is better respected and matches GCC's behavior more closely. See the below thread for more details: https://lore.kernel.org/r/20210205213651.GA16907@Ryzen-5-4500U.localdomain/Signed-off-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Nathan Chancellor 提交于
This flag was originally added to allow clang to find the GNU cross tools in commit 785f11aa ("kbuild: Add better clang cross build support"). This flag was not enough to find the tools at times so '--prefix' was added to the list in commit ef8c4ed9 ("kbuild: allow to use GCC toolchain not in Clang search path") and improved upon in commit ca9b31f6 ("Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation"). Now that '--prefix' specifies a full path and prefix, '--gcc-toolchain' serves no purpose because the kernel builds with '-nostdinc' and '-nostdlib'. This has been verified with self compiled LLVM 10.0.1 and LLVM 13.0.0 as well as a distribution version of LLVM 11.1.0 without binutils in the LLVM toolchain locations. Link: https://reviews.llvm.org/D97902Signed-off-by: NNathan Chancellor <nathan@kernel.org> Reviewed-by: NFangrui Song <maskray@google.com> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Tested-by: NNick Desaulniers <ndesaulniers@google.com> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Rasmus Villemoes 提交于
The patch adding CONFIG_VMLINUX_MAP revealed a small defect in the build system: link-vmlinux.sh takes decisions based on CONFIG_* options, but changing one of those does not always lead to vmlinux being linked again. For most of the CONFIG_* knobs referenced previously, this has probably been hidden by those knobs also affecting some object file, hence indirectly also vmlinux. But CONFIG_VMLINUX_MAP is only handled inside link-vmlinux.sh, and changing CONFIG_VMLINUX_MAP=n to CONFIG_VMLINUX_MAP=y does not cause the build system to re-link (and hence have vmlinux.map emitted). Since that map file is mostly a debugging aid, this is merely a nuisance which is easily worked around by just deleting vmlinux and building again. But one could imagine other (possibly future) CONFIG options that actually do affect the vmlinux binary but which are not captured through some object file dependency. To fix this, make link-vmlinux.sh emit a .vmlinux.d file in the same format as the dependency files generated by gcc, and apply the fixdep logic to that. I've tested that this correctly works with both in-tree and out-of-tree builds. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
Since commit 7ecaf069 ("kbuild: move headers_check rule to usr/include/Makefile"), 'make headers_check' is no-op. This stub target is remaining here in case some scripts still invoke it. In order to prompt people to remove stale code, show a noisy warning message if used. The stub will be really removed after the Linux 5.15 release. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
Since commit f2f02ebd ("kbuild: improve cc-option to clean up all temporary files"), running 'make kernelversion' in a read-only source tree emits a bunch of warnings: mkdir: cannot create directory '.tmp_12345': Permission denied No-build targets such as kernelversion, clean, help, etc. do not need to evaluate $(call cc-option,) or friends. Skip Makefile.compiler so $(call cc-option,) becomes no-op. This not only fixes the warnings, but also runs non-build targets much faster. Basically, all installation targets should also be non-build targets. Unfortunately, vdso_install requires the compiler because it builds vdso before installation. This is a problem that must be fixed by a separate patch. Reported-by: NIsrael Tsadok <itsadok@gmail.com> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
scripts/Kbuild.include is included everywhere, but macros such as cc-option are needed by build targets only. For example, when 'make clean' traverses the tree, it does not need to evaluate $(call cc-option,). Split cc-option, ld-option, etc. to scripts/Makefile.compiler, which is only included from the top Makefile and scripts/Makefile.build. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 19 4月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 14 4月, 2021 2 次提交
-
-
由 Masahiro Yamada 提交于
When the .config file is missing, 'make config', 'make menuconfig', etc. uses a file listed in DEFCONFIG_LIST, if found, as base configuration. Ususally, /boot/config-$(uname -r) exists, and is used as default. However, when you are cross-compiling the kernel, it does not make sense to use /boot/config-* on the build host. It should default to arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG). UML previously did not use DEFCONFIG_LIST at all, but it should be able to use arch/um/configs/$(KBUILD_DEFCONFIG) as a base config file. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
This is a partial revert of commit 2a86f661 ("kbuild: use KBUILD_DEFCONFIG as the fallback for DEFCONFIG_LIST"). Now that the reference to $(DEFCONFIG_LIST) was removed from init/Kconfig, the default KBUILD_DEFCONFIG can go back home. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 12 4月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 09 4月, 2021 2 次提交
-
-
由 Nayna Jain 提交于
The "mrproper" target is still looking for build time generated keys in the kernel root directory instead of certs directory. Fix the path and remove the names of the files which are no longer generated. Fixes: cfc411e7 ("Move certificate handling to its own directory") Signed-off-by: NNayna Jain <nayna@linux.ibm.com> Reviewed-by: NStefan Berger <stefanb@linux.ibm.com> Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com>
-
由 Sami Tolvanen 提交于
This change adds support for Clang’s forward-edge Control Flow Integrity (CFI) checking. With CONFIG_CFI_CLANG, the compiler injects a runtime check before each indirect function call to ensure the target is a valid function with the correct static type. This restricts possible call targets and makes it more difficult for an attacker to exploit bugs that allow the modification of stored function pointers. For more details, see: https://clang.llvm.org/docs/ControlFlowIntegrity.html Clang requires CONFIG_LTO_CLANG to be enabled with CFI to gain visibility to possible call targets. Kernel modules are supported with Clang’s cross-DSO CFI mode, which allows checking between independently compiled components. With CFI enabled, the compiler injects a __cfi_check() function into the kernel and each module for validating local call targets. For cross-module calls that cannot be validated locally, the compiler calls the global __cfi_slowpath_diag() function, which determines the target module and calls the correct __cfi_check() function. This patch includes a slowpath implementation that uses __module_address() to resolve call targets, and with CONFIG_CFI_CLANG_SHADOW enabled, a shadow map that speeds up module look-ups by ~3x. Clang implements indirect call checking using jump tables and offers two methods of generating them. With canonical jump tables, the compiler renames each address-taken function to <function>.cfi and points the original symbol to a jump table entry, which passes __cfi_check() validation. This isn’t compatible with stand-alone assembly code, which the compiler doesn’t instrument, and would result in indirect calls to assembly code to fail. Therefore, we default to using non-canonical jump tables instead, where the compiler generates a local jump table entry <function>.cfi_jt for each address-taken function, and replaces all references to the function with the address of the jump table entry. Note that because non-canonical jump table addresses are local to each component, they break cross-module function address equality. Specifically, the address of a global function will be different in each module, as it's replaced with the address of a local jump table entry. If this address is passed to a different module, it won’t match the address of the same function taken there. This may break code that relies on comparing addresses passed from other components. CFI checking can be disabled in a function with the __nocfi attribute. Additionally, CFI can be disabled for an entire compilation unit by filtering out CC_FLAGS_CFI. By default, CFI failures result in a kernel panic to stop a potential exploit. CONFIG_CFI_PERMISSIVE enables a permissive mode, where the kernel prints out a rate-limited warning instead, and allows execution to continue. This option is helpful for locating type mismatches, but should only be enabled during development. Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NKees Cook <keescook@chromium.org> Tested-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-2-samitolvanen@google.com
-
- 08 4月, 2021 1 次提交
-
-
由 Kees Cook 提交于
This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.htmlCo-developed-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
-
- 05 4月, 2021 1 次提交
-
-
由 Linus Torvalds 提交于
-