- 24 1月, 2023 3 次提交
-
-
由 Kees Cook 提交于
commit 9fc9e278 upstream. Like oops_limit, add warn_limit for limiting the number of warnings when panic_on_warn is not set. Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Petr Mladek <pmladek@suse.com> Cc: tangmeng <tangmeng@uniontech.com> Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-doc@vger.kernel.org Reviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221117234328.594699-5-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Kees Cook 提交于
commit de92f657 upstream. In preparation for keeping oops_limit logic in sync with warn_limit, have oops_limit == 0 disable checking the Oops counter. Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Huang Ying <ying.huang@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-doc@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Jann Horn 提交于
commit d4ccd54d upstream. Many Linux systems are configured to not panic on oops; but allowing an attacker to oops the system **really** often can make even bugs that look completely unexploitable exploitable (like NULL dereferences and such) if each crash elevates a refcount by one or a lock is taken in read mode, and this causes a counter to eventually overflow. The most interesting counters for this are 32 bits wide (like open-coded refcounts that don't use refcount_t). (The ldsem reader count on 32-bit platforms is just 16 bits, but probably nobody cares about 32-bit platforms that much nowadays.) So let's panic the system if the kernel is constantly oopsing. The speed of oopsing 2^32 times probably depends on several factors, like how long the stack trace is and which unwinder you're using; an empirically important one is whether your console is showing a graphical environment or a text console that oopses will be printed to. In a quick single-threaded benchmark, it looks like oopsing in a vfork() child with a very short stack trace only takes ~510 microseconds per run when a graphical console is active; but switching to a text console that oopses are printed to slows it down around 87x, to ~45 milliseconds per run. (Adding more threads makes this faster, but the actual oops printing happens under &die_lock on x86, so you can maybe speed this up by a factor of around 2 and then any further improvement gets eaten up by lock contention.) It looks like it would take around 8-12 days to overflow a 32-bit counter with repeated oopsing on a multi-core X86 system running a graphical environment; both me (in an X86 VM) and Seth (with a distro kernel on normal hardware in a standard configuration) got numbers in that ballpark. 12 days aren't *that* short on a desktop system, and you'd likely need much longer on a typical server system (assuming that people don't run graphical desktop environments on their servers), and this is a *very* noisy and violent approach to exploiting the kernel; and it also seems to take orders of magnitude longer on some machines, probably because stuff like EFI pstore will slow it down a ton if that's active. Signed-off-by: NJann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.comReviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 31 12月, 2022 1 次提交
-
-
由 Guilherme G. Piccoli 提交于
[ Upstream commit 72720937 ] Commit b041b525 ("x86/split_lock: Make life miserable for split lockers") changed the way the split lock detector works when in "warn" mode; basically, it not only shows the warn message, but also intentionally introduces a slowdown through sleeping plus serialization mechanism on such task. Based on discussions in [0], seems the warning alone wasn't enough motivation for userspace developers to fix their applications. This slowdown is enough to totally break some proprietary (aka. unfixable) userspace[1]. Happens that originally the proposal in [0] was to add a new mode which would warns + slowdown the "split locking" task, keeping the old warn mode untouched. In the end, that idea was discarded and the regular/default "warn" mode now slows down the applications. This is quite aggressive with regards proprietary/legacy programs that basically are unable to properly run in kernel with this change. While it is understandable that a malicious application could DoS by split locking, it seems unacceptable to regress old/proprietary userspace programs through a default configuration that previously worked. An example of such breakage was reported in [1]. Add a sysctl to allow controlling the "misery mode" behavior, as per Thomas suggestion on [2]. This way, users running legacy and/or proprietary software are allowed to still execute them with a decent performance while still observing the warning messages on kernel log. [0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@intel.com/ [1] https://github.com/doitsujin/dxvk/issues/2938 [2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/ [ dhansen: minor changelog tweaks, including clarifying the actual problem ] Fixes: b041b525 ("x86/split_lock: Make life miserable for split lockers") Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGuilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NTony Luck <tony.luck@intel.com> Tested-by: NAndre Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
- 12 9月, 2022 2 次提交
-
-
由 Petr Vorel 提交于
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch. This helps people who debug kernel with initramfs with minimal environment (i.e. without coreutils or even busybox) or allow to open sysfs file instead of run 'uname -m' in high level languages. Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.czSigned-off-by: NPetr Vorel <pvorel@suse.cz> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Sterba <dsterba@suse.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Huang Ying 提交于
In NUMA balancing memory tiering mode, if there are hot pages in slow memory node and cold pages in fast memory node, we need to promote/demote hot/cold pages between the fast and cold memory nodes. A choice is to promote/demote as fast as possible. But the CPU cycles and memory bandwidth consumed by the high promoting/demoting throughput will hurt the latency of some workload because of accessing inflating and slow memory bandwidth contention. A way to resolve this issue is to restrict the max promoting/demoting throughput. It will take longer to finish the promoting/demoting. But the workload latency will be better. This is implemented in this patch as the page promotion rate limit mechanism. The number of the candidate pages to be promoted to the fast memory node via NUMA balancing is counted, if the count exceeds the limit specified by the users, the NUMA balancing promotion will be stopped until the next second. A new sysctl knob kernel.numa_balancing_promote_rate_limit_MBps is added for the users to specify the limit. Link: https://lkml.kernel.org/r/20220713083954.34196-3-ying.huang@intel.comSigned-off-by: N"Huang, Ying" <ying.huang@intel.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Tested-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: osalvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zhong Jiang <zhongjiang-ali@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 27 7月, 2022 1 次提交
-
-
由 Laurent Dufour 提交于
During an LPM, while the memory transfer is in progress on the arrival side, some latencies are generated when accessing not yet transferred pages on the arrival side. Thus, the NMI watchdog may be triggered too frequently, which increases the risk to hit an NMI interrupt in a bad place in the kernel, leading to a kernel panic. Disabling the Hard Lockup Watchdog until the memory transfer could be a too strong work around, some users would want this timeout to be eventually triggered if the system is hanging even during an LPM. Introduce a new sysctl variable nmi_watchdog_factor. It allows to apply a factor to the NMI watchdog timeout during an LPM. Just before the CPUs are stopped for the switchover sequence, the NMI watchdog timer is set to watchdog_thresh + factor% A value of 0 has no effect. The default value is 200, meaning that the NMI watchdog is set to 30s during LPM (based on a 10s watchdog_thresh value). Once the memory transfer is achieved, the factor is reset to 0. Setting this value to a high number is like disabling the NMI watchdog during an LPM. Signed-off-by: NLaurent Dufour <ldufour@linux.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220713154729.80789-5-ldufour@linux.ibm.com
-
- 25 6月, 2022 1 次提交
-
-
由 Stephen Kitt 提交于
Text in ``literal`` markup must be separated by word separators, so text like ``lowwater``% renders incorrectly. Add the suggested "\ " after two problematic occurrences. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20220624110230.595740-1-steve@sk2.org [jc: tweaked to use "\ "] Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 14 5月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
A semicolon was missing, and the almost-alphabetical-but-not ordering was confusing, so regroup these by category instead. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 02 5月, 2022 1 次提交
-
-
由 Joel Savitz 提交于
commit dfe56404 ("rcu: Panic after fixed number of stalls") introduced a new systctl but no accompanying documentation. Add a simple entry to the documentation. Signed-off-by: NJoel Savitz <jsavitz@redhat.com> Acked-by: NPaul E. McKenney <paulmck@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 24 3月, 2022 2 次提交
-
-
由 Guilherme G. Piccoli 提交于
Currently the "panic_print" parameter/sysctl allows some interesting debug information to be printed during a panic event. This is useful for example in cases the user cannot kdump due to resource limits, or if the user collects panic logs in a serial output (or pstore) and prefers a fast reboot instead of a kdump. Happens that currently there's no way to see all CPUs backtraces in a panic using "panic_print" on architectures that support that. We do have "oops_all_cpu_backtrace" sysctl, but although partially overlapping in the functionality, they are orthogonal in nature: "panic_print" is a panic tuning (and we have panics without oopses, like direct calls to panic() or maybe other paths that don't go through oops_enter() function), and the original purpose of "oops_all_cpu_backtrace" is to provide more information on oopses for cases in which the users desire to continue running the kernel even after an oops, i.e., used in non-panic scenarios. So, we hereby introduce an additional bit for "panic_print" to allow dumping the CPUs backtraces during a panic event. Link: https://lkml.kernel.org/r/20211109202848.610874-3-gpiccoli@igalia.comSigned-off-by: NGuilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: NFeng Tang <feng.tang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Guilherme G. Piccoli 提交于
Patch series "Some improvements on panic_print". This is a mix of a documentation fix with some additions to the "panic_print" syscall / parameter. The goal here is being able to collect all CPUs backtraces during a panic event and also to enable "panic_print" in a kdump event - details of the reasoning and design choices in the patches. This patch (of 3): Commit de6da1e8 ("panic: add an option to replay all the printk message in buffer") added a new bit to the sysctl/kernel parameter "panic_print", but the documentation was added only in kernel-parameters.txt, not in the sysctl guide. Fix it here by adding bit 5 to sysctl admin-guide documentation. [rdunlap@infradead.org: fix table format warning] Link: https://lkml.kernel.org/r/20220109055635.6999-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/20211109202848.610874-1-gpiccoli@igalia.com Link: https://lkml.kernel.org/r/20211109202848.610874-2-gpiccoli@igalia.com Fixes: de6da1e8 ("panic: add an option to replay all the printk message in buffer") Signed-off-by: NGuilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: NFeng Tang <feng.tang@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 3月, 2022 1 次提交
-
-
由 Huang Ying 提交于
With the advent of various new memory types, some machines will have multiple types of memory, e.g. DRAM and PMEM (persistent memory). The memory subsystem of these machines can be called memory tiering system, because the performance of the different types of memory are usually different. In such system, because of the memory accessing pattern changing etc, some pages in the slow memory may become hot globally. So in this patch, the NUMA balancing mechanism is enhanced to optimize the page placement among the different memory types according to hot/cold dynamically. In a typical memory tiering system, there are CPUs, fast memory and slow memory in each physical NUMA node. The CPUs and the fast memory will be put in one logical node (called fast memory node), while the slow memory will be put in another (faked) logical node (called slow memory node). That is, the fast memory is regarded as local while the slow memory is regarded as remote. So it's possible for the recently accessed pages in the slow memory node to be promoted to the fast memory node via the existing NUMA balancing mechanism. The original NUMA balancing mechanism will stop to migrate pages if the free memory of the target node becomes below the high watermark. This is a reasonable policy if there's only one memory type. But this makes the original NUMA balancing mechanism almost do not work to optimize page placement among different memory types. Details are as follows. It's the common cases that the working-set size of the workload is larger than the size of the fast memory nodes. Otherwise, it's unnecessary to use the slow memory at all. So, there are almost always no enough free pages in the fast memory nodes, so that the globally hot pages in the slow memory node cannot be promoted to the fast memory node. To solve the issue, we have 2 choices as follows, a. Ignore the free pages watermark checking when promoting hot pages from the slow memory node to the fast memory node. This will create some memory pressure in the fast memory node, thus trigger the memory reclaiming. So that, the cold pages in the fast memory node will be demoted to the slow memory node. b. Define a new watermark called wmark_promo which is higher than wmark_high, and have kswapd reclaiming pages until free pages reach such watermark. The scenario is as follows: when we want to promote hot-pages from a slow memory to a fast memory, but fast memory's free pages would go lower than high watermark with such promotion, we wake up kswapd with wmark_promo watermark in order to demote cold pages and free us up some space. So, next time we want to promote hot-pages we might have a chance of doing so. The choice "a" may create high memory pressure in the fast memory node. If the memory pressure of the workload is high, the memory pressure may become so high that the memory allocation latency of the workload is influenced, e.g. the direct reclaiming may be triggered. The choice "b" works much better at this aspect. If the memory pressure of the workload is high, the hot pages promotion will stop earlier because its allocation watermark is higher than that of the normal memory allocation. So in this patch, choice "b" is implemented. A new zone watermark (WMARK_PROMO) is added. Which is larger than the high watermark and can be controlled via watermark_scale_factor. In addition to the original page placement optimization among sockets, the NUMA balancing mechanism is extended to be used to optimize page placement according to hot/cold among different memory types. So the sysctl user space interface (numa_balancing) is extended in a backward compatible way as follow, so that the users can enable/disable these functionality individually. The sysctl is converted from a Boolean value to a bits field. The definition of the flags is, - 0: NUMA_BALANCING_DISABLED - 1: NUMA_BALANCING_NORMAL - 2: NUMA_BALANCING_MEMORY_TIERING We have tested the patch with the pmbench memory accessing benchmark with the 80:20 read/write ratio and the Gauss access address distribution on a 2 socket Intel server with Optane DC Persistent Memory Model. The test results shows that the pmbench score can improve up to 95.9%. Thanks Andrew Morton to help fix the document format error. Link: https://lkml.kernel.org/r/20220221084529.1052339-3-ying.huang@intel.comSigned-off-by: N"Huang, Ying" <ying.huang@intel.com> Tested-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Reviewed-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NYang Shi <shy828301@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Wei Xu <weixugc@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Feng Tang <feng.tang@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 2月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
With tools like kbench9000 giving more finegrained responses, and this basically never having been used ever since it was initially added, let's just get rid of this. There *is* still work to be done on the interrupt handler, but this really isn't the way it's being developed. Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 21 2月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
Now that POOL_BITS == POOL_MIN_BITS, we must unconditionally wake up entropy writers after every extraction. Therefore there's no point of write_wakeup_threshold, so we can move it to the dustbin of unused compatibility sysctls. While we're at it, we can fix a small comparison where we were waking up after <= min rather than < min. Cc: Theodore Ts'o <tytso@mit.edu> Suggested-by: NEric Biggers <ebiggers@kernel.org> Reviewed-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 12 2月, 2022 1 次提交
-
-
由 Huang Ying 提交于
After commit 8a99b683 ("sched: Move SCHED_DEBUG sysctl to debugfs"), some NUMA balancing sysctls enclosed with SCHED_DEBUG has been moved to debugfs. This patch move the document for these sysctls from Documentation/admin-guide/sysctl/kernel.rst to Documentation/scheduler/sched-debug.rst to make the document consistent with the code. Signed-off-by: N"Huang, Ying" <ying.huang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NValentin Schneider <valentin.schneider@arm.com> Acked-by: NMel Gorman <mgorman@techsingularity.net> Link: https://lkml.kernel.org/r/20220210052514.3038279-1-ying.huang@intel.com
-
- 14 12月, 2021 1 次提交
-
-
由 Rob Herring 提交于
Like x86, some users may want to disable userspace PMU counter altogether. Add a sysctl 'perf_user_access' file to control userspace counter access. The default is '0' which is disabled. Writing '1' enables access. Note that x86 supports globally enabling user access by writing '2' to /sys/bus/event_source/devices/cpu/rdpmc. As there's not existing userspace support to worry about, this shouldn't be necessary for Arm. It could be added later if the need arises. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-perf-users@vger.kernel.org Acked-by: NWill Deacon <will@kernel.org> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NRob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-4-robh@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>
-
- 17 11月, 2021 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
The file name: accounting/delay-accounting.rst should be, instead: Documentation/accounting/delay-accounting.rst. Also, there's no need to use doc:`foo`, as automarkup.py will automatically handle plain text mentions to Documentation/ files. So, update its cross-reference accordingly. Fixes: fcb50170 ("delayacct: Document task_delayacct sysctl") Fixes: c3123552 ("docs: accounting: convert to ReST") Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 30 6月, 2021 1 次提交
-
-
由 Wang Qing 提交于
"watchdog/%u" threads has be replaced by cpu_stop_work. The current description is extremely misleading. Link: https://lkml.kernel.org/r/1619687073-24686-5-git-send-email-wangqing@vivo.comSigned-off-by: NWang Qing <wangqing@vivo.com> Reviewed-by: NPetr Mladek <pmladek@suse.com> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Qais Yousef <qais.yousef@arm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Santosh Sivaraj <santosh@fossix.org> Cc: Stephen Kitt <steve@sk2.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 6月, 2021 1 次提交
-
-
由 Mauro Carvalho Chehab 提交于
The :doc:`foo` tag is auto-generated via automarkup.py. So, use the filename at the sources, instead of :doc:`foo`. Signed-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/12abd2290c7ebc05c89178d2556bea740bd70fac.1623824363.git.mchehab+huawei@kernel.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 18 5月, 2021 1 次提交
-
-
由 Mel Gorman 提交于
Update sysctl/kernel.rst. Signed-off-by: NMel Gorman <mgorman@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210512114035.GH3672@suse.de
-
- 15 5月, 2021 1 次提交
-
-
由 Rasmus Villemoes 提交于
When I added CONFIG_MODPROBE_PATH, I neglected to update Documentation/. It's still true that this defaults to /sbin/modprobe, but now via a level of indirection. So document that the kernel might have been built with something other than /sbin/modprobe as the initial value. Link: https://lkml.kernel.org/r/20210420125324.1246826-1-linux@rasmusvillemoes.dk Fixes: 17652f42 ("modules: add CONFIG_MODPROBE_PATH") Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 5月, 2021 1 次提交
-
-
由 Rasmus Villemoes 提交于
It's been a few releases since this defaulted to /sbin/hotplug. Update the text, and include pointers to the two CONFIG_UEVENT_HELPER{,_PATH} config knobs whose help text could provide more info, but also hint that the user probably doesn't need to care at all. Fixes: 7934779a ("Driver-Core: disable /sbin/hotplug by default") Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20210420120638.1104016-1-linux@rasmusvillemoes.dkSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 12 5月, 2021 1 次提交
-
-
由 Daniel Borkmann 提交于
Add a kconfig knob which allows for unprivileged bpf to be disabled by default. If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2. This still allows a transition of 2 -> {0,1} through an admin. Similarly, this also still keeps 1 -> {1} behavior intact, so that once set to permanently disabled, it cannot be undone aside from a reboot. We've also added extra2 with max of 2 for the procfs handler, so that an admin still has a chance to toggle between 0 <-> 2. Either way, as an additional alternative, applications can make use of CAP_BPF that we added a while ago. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net
-
- 09 12月, 2020 3 次提交
-
-
Here's a patch updating the meaning of TAINT_CPU_OUT_OF_SPEC after Borislav introduced changes in a7e1f67e and upcoming patches in tip. TAINT_CPU_OUT_OF_SPEC now means a bit more what it implies as the flag isn't set just because of a CPU misconfiguration or mismatch. Historically it was for SMP kernel oops on an officially SMP incapable processor but now it also covers CPUs whose MSRs have been incorrectly poked at from userspace, drivers being used on non supported architectures, broken firmware, mismatched CPUs, ... Update documentation and script to reflect that. Signed-off-by: NMathieu Chouquet-Stringer <me@mathieu.digital> Link: https://lore.kernel.org/r/20201202153244.709752-1-me@mathieu.digitalSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Andrew Klychkov 提交于
Fix thirty five typos in dm-integrity.rst, dm-raid.rst, dm-zoned.rst, verity.rst, writecache.rst, tsx_async_abort.rst, md.rst, bttv.rst, dvb_references.rst, frontend-cardlist.rst, gspca-cardlist.rst, ipu3.rst, remote-controller.rst, mm/index.rst, numaperf.rst, userfaultfd.rst, module-signing.rst, imx-ddr.rst, intel-speed-select.rst, intel_pstate.rst, ramoops.rst, abi.rst, kernel.rst, vm.rst Signed-off-by: NAndrew Klychkov <andrew.a.klychkov@gmail.com> Link: https://lore.kernel.org/r/20201204072848.GA49895@spblnx124.lanSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Stephen Kitt 提交于
This cleans up a few titles with extra colons, and removes the reference to kernel 2.2. The docs don't yet cover *all* of 5.10 or 5.11, but I think they're close enough. Most entries are documented, and have been checked against current kernels. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20201208074922.30359-1-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 13 8月, 2020 1 次提交
-
-
由 Lepton Wu 提交于
The document reads "%e" should be "executable filename" while actually it could be changed by things like pr_ctl PR_SET_NAME. People who uses "%e" in core_pattern get surprised when they find out they get thread name instead of executable filename. This is either a bug of document or a bug of code. Since the behavior of "%e" is there for long time, it could bring another surprise for users if we "fix" the code. So we just "fix" the document. And more, for users who really need the "executable filename" in core_pattern, we introduce a new "%f" for the real executable filename. We already have "%E" for executable path in kernel, so just reuse most of its code for the new added "%f" format. Signed-off-by: NLepton Wu <ytht.net@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200701031432.2978761-1-ytht.net@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 7月, 2020 1 次提交
-
-
由 Qais Yousef 提交于
Uclamp exposes 3 sysctl knobs: * sched_util_clamp_min * sched_util_clamp_max * sched_util_clamp_min_rt_default Document them in sysctl/kernel.rst. Signed-off-by: NQais Yousef <qais.yousef@arm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200716110347.19553-3-qais.yousef@arm.com
-
- 06 7月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Drop the doubled word "set". Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/20200704032020.21923-12-rdunlap@infradead.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 27 6月, 2020 1 次提交
-
-
由 Stephen Kitt 提交于
This documents the random directory, based on the behaviour seen in drivers/char/random.c. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20200623112514.10650-1-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 20 6月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Fix heading format warnings in admin-guide/sysctl/kernel.rst: Documentation/admin-guide/sysctl/kernel.rst:339: WARNING: Title underline too short. hung_task_all_cpu_backtrace: ================ Documentation/admin-guide/sysctl/kernel.rst:650: WARNING: Title underline too short. oops_all_cpu_backtrace: ================ Fixes: 0ec9dc9b ("kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected") Fixes: 60c958d8 ("panic: add sysctl to dump all CPUs backtraces on oops event") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Reviewed-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/8af1cb77-4b5a-64b9-da5d-f6a95e537f99@infradead.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 09 6月, 2020 3 次提交
-
-
由 Guilherme G. Piccoli 提交于
Usually when the kernel reaches an oops condition, it's a point of no return; in case not enough debug information is available in the kernel splat, one of the last resorts would be to collect a kernel crash dump and analyze it. The problem with this approach is that in order to collect the dump, a panic is required (to kexec-load the crash kernel). When in an environment of multiple virtual machines, users may prefer to try living with the oops, at least until being able to properly shutdown their VMs / finish their important tasks. This patch implements a way to collect a bit more debug details when an oops event is reached, by printing all the CPUs backtraces through the usage of NMIs (on architectures that support that). The sysctl added (and documented) here was called "oops_all_cpu_backtrace", and when set will (as the name suggests) dump all CPUs backtraces. Far from ideal, this may be the last option though for users that for some reason cannot panic on oops. Most of times oopses are clear enough to indicate the kernel portion that must be investigated, but in virtual environments it's possible to observe hypervisor/KVM issues that could lead to oopses shown in other guests CPUs (like virtual APIC crashes). This patch hence aims to help debug such complex issues without resorting to kdump. Signed-off-by: NGuilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NKees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Guilherme G. Piccoli 提交于
Commit 401c636a ("kernel/hung_task.c: show all hung tasks before panic") introduced a change in that we started to show all CPUs backtraces when a hung task is detected _and_ the sysctl/kernel parameter "hung_task_panic" is set. The idea is good, because usually when observing deadlocks (that may lead to hung tasks), the culprit is another task holding a lock and not necessarily the task detected as hung. The problem with this approach is that dumping backtraces is a slightly expensive task, specially printing that on console (and specially in many CPU machines, as servers commonly found nowadays). So, users that plan to collect a kdump to investigate the hung tasks and narrow down the deadlock definitely don't need the CPUs backtrace on dmesg/console, which will delay the panic and pollute the log (crash tool would easily grab all CPUs traces with 'bt -a' command). Also, there's the reciprocal scenario: some users may be interested in seeing the CPUs backtraces but not have the system panic when a hung task is detected. The current approach hence is almost as embedding a policy in the kernel, by forcing the CPUs backtraces' dump (only) on hung_task_panic. This patch decouples the panic event on hung task from the CPUs backtraces dump, by creating (and documenting) a new sysctl called "hung_task_all_cpu_backtrace", analog to the approach taken on soft/hard lockups, that have both a panic and an "all_cpu_backtrace" sysctl to allow individual control. The new mechanism for dumping the CPUs backtraces on hung task detection respects "hung_task_warnings" by not dumping the traces in case there's no warnings left. Signed-off-by: NGuilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NKees Cook <keescook@chromium.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: http://lkml.kernel.org/r/20200327223646.20779-1-gpiccoli@canonical.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael Aquini 提交于
Analogously to the introduction of panic_on_warn, this patch introduces a kernel option named panic_on_taint in order to provide a simple and generic way to stop execution and catch a coredump when the kernel gets tainted by any given flag. This is useful for debugging sessions as it avoids having to rebuild the kernel to explicitly add calls to panic() into the code sites that introduce the taint flags of interest. For instance, if one is interested in proceeding with a post-mortem analysis at the point a given code path is hitting a bad page (i.e. unaccount_page_cache_page(), or slab_bug()), a coredump can be collected by rebooting the kernel with 'panic_on_taint=0x20' amended to the command line. Another, perhaps less frequent, use for this option would be as a means for assuring a security policy case where only a subset of taints, or no single taint (in paranoid mode), is allowed for the running system. The optional switch 'nousertaint' is handy in this particular scenario, as it will avoid userspace induced crashes by writes to sysctl interface /proc/sys/kernel/tainted causing false positive hits for such policies. [akpm@linux-foundation.org: tweak kernel-parameters.txt wording] Suggested-by: NQian Cai <cai@lca.pw> Signed-off-by: NRafael Aquini <aquini@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Bunk <bunk@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 5月, 2020 2 次提交
-
-
由 Stephen Kitt 提交于
This documents ignore-unaligned-usertrap, unaligned-dump-stack, and unaligned-trap, based on arch/arc/kernel/unaligned.c, arch/ia64/kernel/unaligned.c, and arch/parisc/kernel/unaligned.c. While we're at it, integrate unaligned-memory-access.txt into the docs tree. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20200515212443.5012-1-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Stephen Kitt 提交于
This is a read-only export of NGROUPS_MAX. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20200518145836.15816-1-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 18 5月, 2020 1 次提交
-
-
由 Jonathan Corbet 提交于
This reverts commit 2f4c3306. The changes here were fine, but there's a non-documentation change to sysctl.c that makes messes elsewhere; those changes should have been done independently. Signed-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 16 5月, 2020 1 次提交
-
-
由 Stephen Kitt 提交于
This is a read-only export of NGROUPS_MAX, so this patch also changes the declarations in kernel/sysctl.c to const. Signed-off-by: NStephen Kitt <steve@sk2.org> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200515160222.7994-1-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 05 5月, 2020 1 次提交
-
-
由 Stephen Kitt 提交于
Based on the firmware fallback mechanisms documentation and the implementation in drivers/base/firmware_loader/fallback.c. Signed-off-by: NStephen Kitt <steve@sk2.org> Link: https://lore.kernel.org/r/20200429205757.8677-2-steve@sk2.orgSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-