- 31 3月, 2020 1 次提交
-
-
由 KP Singh 提交于
LSM and tracing programs share their helpers with bpf_tracing_func_proto which is only defined (in bpf_trace.c) when BPF_EVENTS is enabled. Instead of adding __weak symbol, make BPF_LSM depend on BPF_EVENTS so that both tracing and LSM programs can actually share helpers. Fixes: fc611f47 ("bpf: Introduce BPF_PROG_TYPE_LSM") Reported-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200330204059.13024-1-kpsingh@chromium.org
-
- 30 3月, 2020 1 次提交
-
-
由 KP Singh 提交于
Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: NKP Singh <kpsingh@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NBrendan Jackman <jackmanb@google.com> Reviewed-by: NFlorent Revest <revest@google.com> Reviewed-by: NThomas Garnier <thgarnie@google.com> Acked-by: NYonghong Song <yhs@fb.com> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NJames Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
-
- 27 3月, 2020 1 次提交
-
-
由 Namhyung Kim 提交于
The PERF_SAMPLE_CGROUP bit is to save (perf_event) cgroup information in the sample. It will add a 64-bit id to identify current cgroup and it's the file handle in the cgroup file system. Userspace should use this information with PERF_RECORD_CGROUP event to match which cgroup it belongs. I put it before PERF_SAMPLE_AUX for simplicity since it just needs a 64-bit word. But if we want bigger samples, I can work on that direction too. Committer testing: $ pahole perf_sample_data | grep -w cgroup -B5 -A5 /* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */ struct perf_regs regs_intr; /* 312 16 */ /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */ u64 stack_user_size; /* 328 8 */ u64 phys_addr; /* 336 8 */ u64 cgroup; /* 344 8 */ /* size: 384, cachelines: 6, members: 22 */ /* padding: 32 */ }; $ Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NTejun Heo <tj@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lore.kernel.org/lkml/20200325124536.2800725-3-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 3月, 2020 1 次提交
-
-
由 Eric W. Biederman 提交于
The cred_guard_mutex is problematic as it is held over possibly indefinite waits for userspace. The possible indefinite waits for userspace that I have identified are: The cred_guard_mutex is held in PTRACE_EVENT_EXIT waiting for the tracer. The cred_guard_mutex is held over "put_user(0, tsk->clear_child_tid)" in exit_mm(). The cred_guard_mutex is held over "get_user(futex_offset, ...") in exit_robust_list. The cred_guard_mutex held over copy_strings. The functions get_user and put_user can trigger a page fault which can potentially wait indefinitely in the case of userfaultfd or if userspace implements part of the page fault path. In any of those cases the userspace process that the kernel is waiting for might make a different system call that winds up taking the cred_guard_mutex and result in deadlock. Holding a mutex over any of those possibly indefinite waits for userspace does not appear necessary. Add exec_update_mutex that will just cover updating the process during exec where the permissions and the objects pointed to by the task struct may be out of sync. The plan is to switch the users of cred_guard_mutex to exec_update_mutex one by one. This lets us move forward while still being careful and not introducing any regressions. Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/ Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/ Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/ Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/ Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/ Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.") Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2") Reviewed-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NBernd Edlinger <bernd.edlinger@hotmail.de> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
- 24 3月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
There is no good reason for __bdevname to exist. Just open code printing the string in the callers. For three of them the format string can be trivially merged into existing printk statements, and in init/do_mounts.c we can at least do the scnprintf once at the start of the function, and unconditional of CONFIG_BLOCK to make the output for tiny configfs a little more helpful. Acked-by: Theodore Ts'o <tytso@mit.edu> # for ext4 Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 3月, 2020 1 次提交
-
-
由 Masahiro Yamada 提交于
The support for __uint128_t is dependent on the target bit size. GCC that defaults to the 32-bit can still build the 64-bit kernel with -m64 flag passed. However, $(cc-option,-D__SIZEOF_INT128__=0) is evaluated against the default machine bit, which may not match to the kernel it is building. Theoretically, this could be evaluated separately for 64BIT/32BIT. config CC_HAS_INT128 bool default !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) if 64BIT default !$(cc-option,$(m32-flag) -D__SIZEOF_INT128__=0) I simplified it more because the 32-bit compiler is unlikely to support __uint128_t. Fixes: c12d3362 ("int128: move __uint128_t compiler test to Kconfig") Reported-by: NGeorge Spelvin <lkml@sdf.org> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Tested-by: NGeorge Spelvin <lkml@sdf.org>
-
- 06 3月, 2020 1 次提交
-
-
由 Thara Gopinath 提交于
Extrapolating on the existing framework to track rt/dl utilization using pelt signals, add a similar mechanism to track thermal pressure. The difference here from rt/dl utilization tracking is that, instead of tracking time spent by a CPU running a RT/DL task through util_avg, the average thermal pressure is tracked through load_avg. This is because thermal pressure signal is weighted time "delta" capacity unlike util_avg which is binary. "delta capacity" here means delta between the actual capacity of a CPU and the decreased capacity a CPU due to a thermal event. In order to track average thermal pressure, a new sched_avg variable avg_thermal is introduced. Function update_thermal_load_avg can be called to do the periodic bookkeeping (accumulate, decay and average) of the thermal pressure. Reviewed-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NThara Gopinath <thara.gopinath@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200222005213.3873-2-thara.gopinath@linaro.org
-
- 04 3月, 2020 1 次提交
-
-
由 Masami Hiramatsu 提交于
Show line and column when we got a parse error in bootconfig tool. Current lib/bootconfig shows the parse error with byte offset, but that is not human readable. This makes xbc_init() not showing error message itself but able to pass the error message and position to caller, so that the caller can decode it and show the error message with line number and columns. With this patch, bootconfig tool shows an error with line:column as below. $ cat samples/bad-dotword.bconf # do not start keyword with . key { .word = 1 } $ ./bootconfig -a samples/bad-dotword.bconf initrd Parse Error: Invalid keyword at 3:3 Link: http://lkml.kernel.org/r/158323469002.10560.4023923847704522760.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 03 3月, 2020 2 次提交
-
-
由 Quentin Perret 提交于
CONFIG_TRIM_UNUSED_KSYMS currently removes all unused exported symbols from ksymtab. This works really well when using in-tree drivers, but cannot be used in its current form if some of them are out-of-tree. Indeed, even if the list of symbols required by out-of-tree drivers is known at compile time, the only solution today to guarantee these don't get trimmed is to set CONFIG_TRIM_UNUSED_KSYMS=n. This not only wastes space, but also makes it difficult to control the ABI usable by vendor modules in distribution kernels such as Android. Being able to control the kernel ABI surface is particularly useful to ship a unique Generic Kernel Image (GKI) for all vendors, which is a first step in the direction of getting all vendors to contribute their code upstream. As such, attempt to improve the situation by enabling users to specify a symbol 'whitelist' at compile time. Any symbol specified in this whitelist will be kept exported when CONFIG_TRIM_UNUSED_KSYMS is set, even if it has no in-tree user. The whitelist is defined as a simple text file, listing symbols, one per line. Acked-by: NJessica Yu <jeyu@kernel.org> Acked-by: NNicolas Pitre <nico@fluxnic.net> Tested-by: NMatthias Maennich <maennich@google.com> Reviewed-by: NMatthias Maennich <maennich@google.com> Signed-off-by: NQuentin Perret <qperret@google.com> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Masahiro Yamada 提交于
Most of the Kconfig commands (except defconfig and all*config) read the .config file as a base set of CONFIG options. When it does not exist, the files in DEFCONFIG_LIST are searched in this order and loaded if found. I do not see much sense in the last two lines in DEFCONFIG_LIST. [1] ARCH_DEFCONFIG The entry for DEFCONFIG_LIST is guarded by 'depends on !UML'. So, the ARCH_DEFCONFIG definition in arch/x86/um/Kconfig is meaningless. arch/{sh,sparc,x86}/Kconfig define ARCH_DEFCONFIG depending on 32 or 64 bit variant symbols. This is a little bit strange; ARCH_DEFCONFIG should be a fixed string because the base config file is loaded before the symbol evaluation stage. Using KBUILD_DEFCONFIG makes more sense because it is fixed before Kconfig is invoked. Fortunately, arch/{sh,sparc,x86}/Makefile define it in the same way, and it works as expected. Hence, replace ARCH_DEFCONFIG with "arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG)". [2] arch/$(ARCH)/defconfig This file path is no longer valid. The defconfig files are always located in the arch configs/ directories. $ find arch -name defconfig | sort arch/alpha/configs/defconfig arch/arm64/configs/defconfig arch/csky/configs/defconfig arch/nds32/configs/defconfig arch/riscv/configs/defconfig arch/s390/configs/defconfig arch/unicore32/configs/defconfig The path arch/*/configs/defconfig is already covered by "arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG)". So, this file path is not necessary. I moved the default KBUILD_DEFCONFIG to the top Makefile. Otherwise, the 7 architectures listed above would end up with endless loop of syncconfig. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 26 2月, 2020 1 次提交
-
-
由 Masami Hiramatsu 提交于
Since commit d8a953dd ("bootconfig: Set CONFIG_BOOT_CONFIG=n by default") also changed the CONFIG_BOOTTIME_TRACING to select CONFIG_BOOT_CONFIG to show the boot-time tracing on the menu, it introduced wrong dependencies with BLK_DEV_INITRD as below. WARNING: unmet direct dependencies detected for BOOT_CONFIG Depends on [n]: BLK_DEV_INITRD [=n] Selected by [y]: - BOOTTIME_TRACING [=y] && TRACING_SUPPORT [=y] && FTRACE [=y] && TRACING [=y] This makes the CONFIG_BOOT_CONFIG selects CONFIG_BLK_DEV_INITRD to fix this error and make CONFIG_BOOTTIME_TRACING=n by default, so that both boot-time tracing and boot configuration off but those appear on the menu list. Link: http://lkml.kernel.org/r/158264140162.23842.11237423518607465535.stgit@devnote2 Fixes: d8a953dd ("bootconfig: Set CONFIG_BOOT_CONFIG=n by default") Reported-by: NRandy Dunlap <rdunlap@infradead.org> Compiled-tested-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 21 2月, 2020 4 次提交
-
-
由 Masami Hiramatsu 提交于
Print arraied values as multiple same options for legacy kernel command line. With this rule, if the "kernel.*" and "init.*" array entries in bootconfig are printed out as multiple same options, e.g. kernel { console = "ttyS0,115200" console += "tty0" } will be correctly converted to console="ttyS0,115200" console="tty0" in the kernel command line. Link: http://lkml.kernel.org/r/158220118213.26565.8163300497009463916.stgit@devnote2Reported-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Add bootconfig magic word to the end of bootconfig on initrd image for indicating explicitly the bootconfig is there. Also tools/bootconfig treats wrong size or wrong checksum or parse error as an error, because if there is a bootconfig magic word, there must be a bootconfig. The bootconfig magic word is "#BOOTCONFIG\n", 12 bytes word. Thus the block image of the initrd file with bootconfig is as follows. [Initrd][bootconfig][size][csum][#BOOTCONFIG\n] Link: http://lkml.kernel.org/r/158220112263.26565.3944814205960612841.stgit@devnote2Suggested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Set CONFIG_BOOT_CONFIG=n by default. This also warns user if CONFIG_BOOT_CONFIG=n but "bootconfig" is given in the kernel command line. Link: http://lkml.kernel.org/r/158220111291.26565.9036889083940367969.stgit@devnote2Suggested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Qiujun Huang 提交于
In fact, this function is only used in this file, so mark it with 'static'. Link: http://lkml.kernel.org/r/1581852511-14163-1-git-send-email-hqjagain@gmail.comAcked-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NQiujun Huang <hqjagain@gmail.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 11 2月, 2020 2 次提交
-
-
由 Steven Rostedt (VMware) 提交于
The current implementation does a naive search of "bootconfig" on the kernel command line. But this could find "bootconfig" that is part of another option in quotes (although highly unlikely). But it also needs to find '--' on the kernel command line to know if it should append a '--' or not when a bootconfig in the initrd file has an "init" section. The check uses the naive strstr() to find to see if it exists. But this can return a false positive if it exists in an option and then the "init" section in the initrd will not be appended properly. Using parse_args() to find both of these will solve both of these problems. Link: https://lore.kernel.org/r/202002070954.C18E7F58B@keescook Fixes: 7495e092 ("bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline") Fixes: 13199162 ("bootconfig: init: Allow admin to use bootconfig for init command line") Reported-by: NKees Cook <keescook@chromium.org> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Since there is no user except CONFIG_BOOT_CONFIG and no plan to use it from other functions, CONFIG_LIBXBC can be removed and we can use CONFIG_BOOT_CONFIG directly. Link: http://lkml.kernel.org/r/158098769281.939.16293492056419481105.stgit@devnote2Suggested-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 06 2月, 2020 2 次提交
-
-
由 Masami Hiramatsu 提交于
Show the number of bootconfig nodes on boot message. Link: http://lkml.kernel.org/r/158091062297.27924.9051634676068550285.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Use "bootconfig" (1 word) instead of "boot config" (2 words) in the boot message. Link: http://lkml.kernel.org/r/158091059459.27924.14414336187441539879.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 05 2月, 2020 1 次提交
-
-
由 Steven Rostedt (VMware) 提交于
As the bootconfig is appended to the initrd it is not as easy to modify as the kernel command line. If there's some issue with the kernel, and the developer wants to boot a pristine kernel, it should not be needed to modify the initrd to remove the bootconfig for a single boot. As bootconfig is silently added (if the admin does not know where to look they may not know it's being loaded). It should be explicitly added to the kernel cmdline. The loading of the bootconfig is only done if "bootconfig" is on the kernel command line. This will let admins know that the kernel command line is extended. Note, after adding printk()s for when the size is too great or the checksum is wrong, exposed that the current method always looked for the boot config, and if this size and checksum matched, it would parse it (as if either is wrong a printk has been added to show this). It's better to only check this if the boot config is asked to be looked for. Link: https://lore.kernel.org/r/CAHk-=wjfjO+h6bQzrTf=YCZA53Y3EDyAs3Z4gEsT7icA3u_Psw@mail.gmail.comAcked-by: NMasami Hiramatsu <mhiramat@kernel.org> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 01 2月, 2020 4 次提交
-
-
由 Christophe Leroy 提交于
This message leads to thinking that memory protection is not implemented for the said architecture, whereas absence of CONFIG_STRICT_KERNEL_RWX only means that memory protection has not been selected at compile time. Don't print this message when CONFIG_ARCH_HAS_STRICT_KERNEL_RWX is selected by the architecture. Instead, print "Kernel memory protection not selected by kernel config." Link: http://lkml.kernel.org/r/62477e446d9685459d4f27d193af6ff1bd69d55f.1578557581.git.christophe.leroy@c-s.frSigned-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Acked-by: NKees Cook <keescook@chromium.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arvind Sankar 提交于
Patch series "init/main.c: minor cleanup/bugfix of envvar handling", v2. unknown_bootoption passes unrecognized command line arguments to init as either environment variables or arguments. Some of the logic in the function is broken for quoted command line arguments. When an argument of the form param="value" is processed by parse_args and passed to unknown_bootoption, the command line has param\0"value\0 with val pointing to the beginning of value. The helper function repair_env_string is then used to restore the '=' character that was removed by parse_args, and strip the quotes off fully. This results in param=value\0\0 and val ends up pointing to the 'a' instead of the 'v' in value. This bug was introduced when repair_env_string was refactored into a separate function, and the decrement of val in repair_env_string became dead code. This causes two problems in unknown_bootoption in the two places where the val pointer is used as a substitute for the length of param: 1. An argument of the form param=".value" is misinterpreted as a potential module parameter, with the result that it will not be placed in init's environment. 2. An argument of the form param="value" is checked to see if param is an existing environment variable that should be overwritten, but the comparison is off-by-one and compares 'param=v' instead of 'param=' against the existing environment. So passing, for example, TERM="vt100" on the command line results in init being passed both TERM=linux and TERM=vt100 in its environment. Patch 1 adds logging for the arguments and environment passed to init and is independent of the rest: it can be dropped if this is unnecessarily verbose. Patch 2 removes repair_env_string from initcall parameter parsing in do_initcall_level, as that uses a separate copy of the command line now and the repairing is no longer necessary. Patch 3 fixes the bug in unknown_bootoption by recording the length of param explicitly instead of implying it from val-param. This patch (of 3): Commit a99cd112 ("init: fix bug where environment vars can't be passed via boot args") introduced two minor bugs in unknown_bootoption by factoring out the quoted value handling into a separate function. When value is quoted, repair_env_string will move the value up 1 byte to strip the quotes, so val in unknown_bootoption no longer points to the actual location of the value. The result is that an argument of the form param=".value" is mistakenly treated as a potential module parameter and is not placed in init's environment, and an argument of the form param="value" can result in a duplicate environment variable: eg TERM="vt100" on the command line will result in both TERM=linux and TERM=vt100 being placed into init's environment. Fix this by recording the length of the param before calling repair_env_string instead of relying on val. Link: http://lkml.kernel.org/r/20191212180023.24339-4-nivedita@alum.mit.eduSigned-off-by: NArvind Sankar <nivedita@alum.mit.edu> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arvind Sankar 提交于
Since commit 08746a65 ("init: fix in-place parameter modification regression"), parse_args in do_initcall_level is called on a copy of saved_command_line. It is unnecessary to call repair_env_string during this parsing, as this copy is not used for anything later. Remove the now unnecessary arguments from repair_env_string as well. Link: http://lkml.kernel.org/r/20191212180023.24339-3-nivedita@alum.mit.eduSigned-off-by: NArvind Sankar <nivedita@alum.mit.edu> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arvind Sankar 提交于
Extend logging in `run_init_process` to also show the arguments and environment that we are passing to init. Link: http://lkml.kernel.org/r/20191212180023.24339-2-nivedita@alum.mit.eduSigned-off-by: NArvind Sankar <nivedita@alum.mit.edu> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 1月, 2020 1 次提交
-
-
由 Masami Hiramatsu 提交于
Fix Kconfig help message since the bootconfig file is only available to be appended to initramfs. And also add a reference to the documentation. Link: http://lkml.kernel.org/r/157949058031.25888.18399447161895787505.stgit@devnote2Reported-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 20 1月, 2020 1 次提交
-
-
由 Johannes Berg 提交于
This reverts commit 786b2384 ("um: Enable CONFIG_CONSTRUCTORS"). There are two issues with this commit, uncovered by Anton in tests on some (Debian) systems: 1) I completely forgot to call any constructors if CONFIG_CONSTRUCTORS isn't set. Don't recall now if it just wasn't needed on my system, or if I never tested this case. 2) With that fixed, it works - with CONFIG_CONSTRUCTORS *unset*. If I set CONFIG_CONSTRUCTORS, it fails again, which isn't totally unexpected since whatever wanted to run is likely to have to run before the kernel init etc. that calls the constructors in this case. Basically, some constructors that gcc emits (libc has?) need to run very early during init; the failure mode otherwise was that the ptrace fork test already failed: ---------------------- $ ./linux mem=512M Core dump limits : soft - 0 hard - NONE Checking that ptrace can change system call numbers...check_ptrace : child exited with exitcode 6, while expecting 0; status 0x67f Aborted ---------------------- Thinking more about this, it's clear that we simply cannot support CONFIG_CONSTRUCTORS in UML. All the cases we need now (gcov, kasan) involve not use of the __attribute__((constructor)), but instead some constructor code/entry generated by gcc. Therefore, we cannot distinguish between kernel constructors and system constructors. Thus, revert this commit. Cc: stable@vger.kernel.org [5.4+] Fixes: 786b2384 ("um: Enable CONFIG_CONSTRUCTORS") Reported-by: NAnton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Acked-by: NAnton Ivanov <anton.ivanov@cambridgegreys.co.uk> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
- 14 1月, 2020 8 次提交
-
-
由 Thomas Gleixner 提交于
To support time namespaces in the vdso with a minimal impact on regular non time namespace affected tasks, the namespace handling needs to be hidden in a slow path. The most obvious place is vdso_seq_begin(). If a task belongs to a time namespace then the VVAR page which contains the system wide vdso data is replaced with a namespace specific page which has the same layout as the VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. The extra check in the case that vdso_data->seq is odd, e.g. a concurrent update of the vdso data is in progress, is not really affecting regular tasks which are not part of a time namespace as the task is spin waiting for the update to finish and vdso_data->seq to become even again. If a time namespace task hits that code path, it invokes the corresponding time getter function which retrieves the real VVAR page, reads host time and then adds the offset for the requested clock which is stored in the special VVAR page. If VDSO time namespace support is disabled the whole magic is compiled out. Initial testing shows that the disabled case is almost identical to the host case which does not take the slow timens path. With the special timens page installed the performance hit is constant time and in the range of 5-7%. For the vdso functions which are not using the sequence count an unconditional check for vdso_data->clock_mode is added which switches to the real vdso when the clock_mode is VCLOCK_TIMENS. [avagin: Make do_hres_timens() work with raw clocks too: choose vdso_data pointer by CS_RAW offset.] Suggested-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-21-dima@arista.com
-
由 Andrei Vagin 提交于
Time Namespace isolates clock values. The kernel provides access to several clocks CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_BOOTTIME, etc. CLOCK_REALTIME System-wide clock that measures real (i.e., wall-clock) time. CLOCK_MONOTONIC Clock that cannot be set and represents monotonic time since some unspecified starting point. CLOCK_BOOTTIME Identical to CLOCK_MONOTONIC, except it also includes any time that the system is suspended. For many users, the time namespace means the ability to changes date and time in a container (CLOCK_REALTIME). Providing per namespace notions of CLOCK_REALTIME would be complex with a massive overhead, but has a dubious value. But in the context of checkpoint/restore functionality, monotonic and boottime clocks become interesting. Both clocks are monotonic with unspecified starting points. These clocks are widely used to measure time slices and set timers. After restoring or migrating processes, it has to be guaranteed that they never go backward. In an ideal case, the behavior of these clocks should be the same as for a case when a whole system is suspended. All this means that it is required to set CLOCK_MONOTONIC and CLOCK_BOOTTIME clocks, which can be achieved by adding per-namespace offsets for clocks. A time namespace is similar to a pid namespace in the way how it is created: unshare(CLONE_NEWTIME) system call creates a new time namespace, but doesn't set it to the current process. Then all children of the process will be born in the new time namespace, or a process can use the setns() system call to join a namespace. This scheme allows setting clock offsets for a namespace, before any processes appear in it. All available clone flags have been used, so CLONE_NEWTIME uses the highest bit of CSIGNAL. It means that it can be used only with the unshare() and the clone3() system calls. [ tglx: Adjusted paragraph about clone3() to reality and massaged the changelog a bit. ] Co-developed-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NAndrei Vagin <avagin@gmail.com> Signed-off-by: NDmitry Safonov <dima@arista.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://criu.org/Time_namespace Link: https://lists.openvz.org/pipermail/criu/2018-June/041504.html Link: https://lore.kernel.org/r/20191112012724.250792-4-dima@arista.com
-
由 Vlastimil Babka 提交于
Commit 96a2b03f ("mm, debug_pagelloc: use static keys to enable debugging") has introduced a static key to reduce overhead when debug_pagealloc is compiled in but not enabled. It relied on the assumption that jump_label_init() is called before parse_early_param() as in start_kernel(), so when the "debug_pagealloc=on" option is parsed, it is safe to enable the static key. However, it turns out multiple architectures call parse_early_param() earlier from their setup_arch(). x86 also calls jump_label_init() even earlier, so no issue was found while testing the commit, but same is not true for e.g. ppc64 and s390 where the kernel would not boot with debug_pagealloc=on as found by our QA. To fix this without tricky changes to init code of multiple architectures, this patch partially reverts the static key conversion from 96a2b03f. Init-time and non-fastpath calls (such as in arch code) of debug_pagealloc_enabled() will again test a simple bool variable. Fastpath mm code is converted to a new debug_pagealloc_enabled_static() variant that relies on the static key, which is enabled in a well-defined point in mm_init() where it's guaranteed that jump_label_init() has been called, regardless of architecture. [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early] Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz Fixes: 96a2b03f ("mm, debug_pagelloc: use static keys to enable debugging") Signed-off-by: NVlastimil Babka <vbabka@suse.cz> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Masami Hiramatsu 提交于
Since the current kernel command line is too short to describe long and many options for init (e.g. systemd command line options), this allows admin to use boot config for init command line. All init command line under "init." keywords will be passed to init. For example, init.systemd { unified_cgroup_hierarchy = 1 debug_shell default_timeout_start_sec = 60 } Link: http://lkml.kernel.org/r/157867229521.17873.654222294326542349.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Since the current kernel command line is too short to describe many options which supported by kernel, allow user to use boot config to setup (add) the command line options. All kernel parameters under "kernel." keywords will be used for setting up extra kernel command line. For example, kernel { audit = on audit_backlog_limit = 256 } Note that you can not specify some early parameters (like console etc.) by this method, since it is loaded after early parameters parsed. Link: http://lkml.kernel.org/r/157867228333.17873.11962796367032622466.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Since initcall_command_line is used as a temporary buffer, it could be freed after usage. Allocate it in do_initcall() and free it after used. Link: http://lkml.kernel.org/r/157867227145.17873.17513760552008505454.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Load the extended boot config data from the tail of initrd image. If there is an SKC data there, it has [(u32)size][(u32)checksum] header (in really, this is a footer) at the end of initrd. If the checksum (simple sum of bytes) is match, this starts parsing it from there. Link: http://lkml.kernel.org/r/157867222435.17873.9936667353335606867.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Masami Hiramatsu 提交于
Extra Boot Config (XBC) allows admin to pass a tree-structured boot configuration file when boot up the kernel. This extends the kernel command line in an efficient way. Boot config will contain some key-value commands, e.g. key.word = value1 another.key.word = value2 It can fold same keys with braces, also you can write array data. For example, key { word1 { setting1 = data setting2 } word2.array = "val1", "val2" } User can access these key-value pair and tree structure via SKC APIs. Link: http://lkml.kernel.org/r/157867221257.17873.1775090991929862549.stgit@devnote2Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 03 1月, 2020 1 次提交
-
-
由 Dominik Brodowski 提交于
This reverts commit 8243186f ("fs: remove ksys_dup()") and the subsequent fix for it in commit 2d3145f8 ("early init: fix error handling when opening /dev/console"). Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls caused more trouble than what is worth it: it requires accessing vfs internals and it turns out there were other bugs in it too. In particular, the file reference counting was wrong - because unlike the original "open+2*dup" sequence it used "filp_open+3*f_dupfd" and thus had an extra leaked file reference. That in turn then caused odd problems with Androidx86 long after boot becaue of how the extra reference to the console kept the session active even after all file descriptors had been closed. Reported-by: Nyouling 257 <youling257@gmail.com> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 12月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
The comment says "this should never fail", but it definitely can fail when you have odd initial boot filesystems, or kernel configurations. So get the error handling right: filp_open() returns an error pointer. Reported-by: NJesse Barnes <jsbarnes@google.com> Reported-by: Nyouling 257 <youling257@gmail.com> Fixes: 8243186f ("fs: remove ksys_dup()") Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 12月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
The "trivial conversion" in commit cccaa5e3 ("init: use do_mount() instead of ksys_mount()") was totally broken, since it didn't handle the case of a NULL mount data pointer. And while I had "tested" it (and presumably Dominik had too) that bug was hidden by me having options. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Arnd Bergmann <arnd@arndb.de> Reported-by: NOndřej Jirman <megi@xff.cz> Reported-by: NGuenter Roeck <linux@roeck-us.net> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Reported-and-tested-by: NBorislav Petkov <bp@suse.de> Tested-by: NChris Clayton <chris2553@googlemail.com> Tested-by: NEric Biggers <ebiggers@kernel.org> Tested-by: NGeert Uytterhoeven <geert@linux-m68k.org> Tested-by: NGuido Günther <agx@sigxcpu.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 12月, 2019 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
device.h has everything and the kitchen sink when it comes to struct device things, so split out the struct driver things things to a separate .h file to make things easier to maintain and manage over time. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Saravana Kannan <saravanak@google.com> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20191209193303.1694546-7-gregkh@linuxfoundation.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 12月, 2019 2 次提交
-
-
由 Shile Zhang 提交于
Use a more generic name for additional table sorting usecases, such as the upcoming ORC table sorting feature. This tool is not tied to exception table sorting anymore. No functional changes intended. [ mingo: Rewrote the changelog. ] Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: linux-kbuild@vger.kernel.org Link: https://lkml.kernel.org/r/20191204004633.88660-6-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dominik Brodowski 提交于
ksys_dup() is used only at one place in the kernel, namely to duplicate fd 0 of /dev/console to stdout and stderr. The same functionality can be achieved by using functions already available within the kernel namespace. Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
-