- 24 1月, 2023 3 次提交
-
-
由 Kees Cook 提交于
commit 9fc9e278 upstream. Like oops_limit, add warn_limit for limiting the number of warnings when panic_on_warn is not set. Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Petr Mladek <pmladek@suse.com> Cc: tangmeng <tangmeng@uniontech.com> Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-doc@vger.kernel.org Reviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221117234328.594699-5-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Kees Cook 提交于
commit de92f657 upstream. In preparation for keeping oops_limit logic in sync with warn_limit, have oops_limit == 0 disable checking the Oops counter. Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Huang Ying <ying.huang@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-doc@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Jann Horn 提交于
commit d4ccd54d upstream. Many Linux systems are configured to not panic on oops; but allowing an attacker to oops the system **really** often can make even bugs that look completely unexploitable exploitable (like NULL dereferences and such) if each crash elevates a refcount by one or a lock is taken in read mode, and this causes a counter to eventually overflow. The most interesting counters for this are 32 bits wide (like open-coded refcounts that don't use refcount_t). (The ldsem reader count on 32-bit platforms is just 16 bits, but probably nobody cares about 32-bit platforms that much nowadays.) So let's panic the system if the kernel is constantly oopsing. The speed of oopsing 2^32 times probably depends on several factors, like how long the stack trace is and which unwinder you're using; an empirically important one is whether your console is showing a graphical environment or a text console that oopses will be printed to. In a quick single-threaded benchmark, it looks like oopsing in a vfork() child with a very short stack trace only takes ~510 microseconds per run when a graphical console is active; but switching to a text console that oopses are printed to slows it down around 87x, to ~45 milliseconds per run. (Adding more threads makes this faster, but the actual oops printing happens under &die_lock on x86, so you can maybe speed this up by a factor of around 2 and then any further improvement gets eaten up by lock contention.) It looks like it would take around 8-12 days to overflow a 32-bit counter with repeated oopsing on a multi-core X86 system running a graphical environment; both me (in an X86 VM) and Seth (with a distro kernel on normal hardware in a standard configuration) got numbers in that ballpark. 12 days aren't *that* short on a desktop system, and you'd likely need much longer on a typical server system (assuming that people don't run graphical desktop environments on their servers), and this is a *very* noisy and violent approach to exploiting the kernel; and it also seems to take orders of magnitude longer on some machines, probably because stuff like EFI pstore will slow it down a ton if that's active. Signed-off-by: NJann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.comReviewed-by: NLuis Chamberlain <mcgrof@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 07 1月, 2023 1 次提交
-
-
由 Kim Phillips 提交于
commit 1198d231 upstream. Currently, these options cause the following libkmod error: libkmod: ERROR ../libkmod/libkmod-config.c:489 kcmdline_parse_result: \ Ignoring bad option on kernel command line while parsing module \ name: 'ivrs_xxxx[XX:XX' Fix by introducing a new parameter format for these options and throw a warning for the deprecated format. Users are still allowed to omit the PCI Segment if zero. Adding a Link: to the reason why we're modding the syntax parsing in the driver and not in libkmod. Fixes: ca3bf5d4 ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-modules/20200310082308.14318-2-lucas.demarchi@intel.com/Reported-by: NKim Phillips <kim.phillips@amd.com> Co-developed-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: NKim Phillips <kim.phillips@amd.com> Link: https://lore.kernel.org/r/20220919155638.391481-2-kim.phillips@amd.comSigned-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 31 12月, 2022 1 次提交
-
-
由 Guilherme G. Piccoli 提交于
[ Upstream commit 72720937 ] Commit b041b525 ("x86/split_lock: Make life miserable for split lockers") changed the way the split lock detector works when in "warn" mode; basically, it not only shows the warn message, but also intentionally introduces a slowdown through sleeping plus serialization mechanism on such task. Based on discussions in [0], seems the warning alone wasn't enough motivation for userspace developers to fix their applications. This slowdown is enough to totally break some proprietary (aka. unfixable) userspace[1]. Happens that originally the proposal in [0] was to add a new mode which would warns + slowdown the "split locking" task, keeping the old warn mode untouched. In the end, that idea was discarded and the regular/default "warn" mode now slows down the applications. This is quite aggressive with regards proprietary/legacy programs that basically are unable to properly run in kernel with this change. While it is understandable that a malicious application could DoS by split locking, it seems unacceptable to regress old/proprietary userspace programs through a default configuration that previously worked. An example of such breakage was reported in [1]. Add a sysctl to allow controlling the "misery mode" behavior, as per Thomas suggestion on [2]. This way, users running legacy and/or proprietary software are allowed to still execute them with a decent performance while still observing the warning messages on kernel log. [0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@intel.com/ [1] https://github.com/doitsujin/dxvk/issues/2938 [2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/ [ dhansen: minor changelog tweaks, including clarifying the actual problem ] Fixes: b041b525 ("x86/split_lock: Make life miserable for split lockers") Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGuilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NTony Luck <tony.luck@intel.com> Tested-by: NAndre Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
- 23 11月, 2022 2 次提交
-
-
由 Perry Yuan 提交于
Add a new amd pstate driver command line option to enable driver passive working mode via MSR and shared memory interface to request desired performance within abstract scale and the power management firmware (SMU) convert the perf requests into actual hardware pstates. Also the `disable` parameter can disable the pstate driver loading by adding `amd_pstate=disable` to kernel command line. Acked-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NGautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: NWyes Karny <wyes.karny@amd.com> Signed-off-by: NPerry Yuan <Perry.Yuan@amd.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Perry Yuan 提交于
Introduce the `amd_pstate` driver new working mode with `amd_pstate=passive` added to kernel command line. If there is no passive mode enabled by user, amd_pstate driver will be disabled by default for now. Acked-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NGautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: NWyes Karny <wyes.karny@amd.com> Signed-off-by: NPerry Yuan <Perry.Yuan@amd.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 25 10月, 2022 1 次提交
-
-
由 Hans Verkuil 提交于
The example on how to use and test Capture Overlay specified the wrong video device node. Back in 2015 the loop_video control moved from the output device to the capture device, but this example code is still referring to the output video device. Signed-off-by: NHans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org>
-
- 19 10月, 2022 1 次提交
-
-
由 Milan Broz 提交于
Add documentation that was missing from commit 5721d4e5 ("dm verity: Add optional "try_verify_in_tasklet" feature"). Signed-off-by: NMilan Broz <gmazyland@gmail.com> Signed-off-by: NMike Snitzer <snitzer@kernel.org>
-
- 14 10月, 2022 1 次提交
-
-
由 Bagas Sanjaya 提交于
Commit d206cef0 ("ACPI: docs: Drop useless DSDT override documentation") removes useless DSDT override documentation. However, the commit forgets to prune the documentation entry from table of contents of ACPI admin guide documentation, hence triggers Sphinx warning: Documentation/admin-guide/acpi/index.rst:8: WARNING: toctree contains reference to nonexisting document 'admin-guide/acpi/dsdt-override' Prune the entry to fix the warning. Fixes: d206cef0 ("ACPI: docs: Drop useless DSDT override documentation") Signed-off-by: NBagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 11 10月, 2022 1 次提交
-
-
由 Juergen Gross 提交于
Instead of always doing the safe variants for reading and writing MSRs in Xen PV guests, make the behavior controllable via Kconfig option and a boot parameter. The default will be the current behavior, which is to always use the safe variant. Signed-off-by: NJuergen Gross <jgross@suse.com>
-
- 06 10月, 2022 1 次提交
-
-
由 Meng Li 提交于
Introduce the AMD P-State unit test module design and implementation. It also talks about kselftest and how to use. Signed-off-by: NMeng Li <li.meng@amd.com> Acked-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NShuah Khan <skhan@linuxfoundation.org> Signed-off-by: NShuah Khan <skhan@linuxfoundation.org>
-
- 04 10月, 2022 7 次提交
-
-
由 Johannes Weiner 提交于
Since 2d1c4980 ("mm: memcontrol: make swap tracking an integral part of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP. Update the sites accordingly and drop the symbol. [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM, which hasn't been a user-visible symbol for over half a decade. ] Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NShakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Johannes Weiner 提交于
The swapaccounting= commandline option already does very little today. To close a trivial containment failure case, the swap ownership tracking part of the swap controller has recently become mandatory (see commit 2d1c4980 ("mm: memcontrol: make swap tracking an integral part of memory control") for details), which makes up the majority of the work during swapout, swapin, and the swap slot map. The only thing left under this flag is the page_counter operations and the visibility of the swap control files in the first place, which are rather meager savings. There also aren't many scenarios, if any, where controlling the memory of a cgroup while allowing it unlimited access to a global swap space is a workable resource isolation strategy. On the other hand, there have been several bugs and confusion around the many possible swap controller states (cgroup1 vs cgroup2 behavior, memory accounting without swap accounting, memcg runtime disabled). This puts the maintenance overhead of retaining the toggle above its practical benefits. Deprecate it. Link: https://lkml.kernel.org/r/20220926135704.400818-3-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Suggested-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zach O'Keefe 提交于
The main benefit of THPs are that they can be mapped at the pmd level, increasing the likelihood of TLB hit and spending less cycles in page table walks. pte-mapped hugepages - that is - hugepage-aligned compound pages of order HPAGE_PMD_ORDER mapped by ptes - although being contiguous in physical memory, don't have this advantage. In fact, one could argue they are detrimental to system performance overall since they occupy a precious hugepage-aligned/sized region of physical memory that could otherwise be used more effectively. Additionally, pte-mapped hugepages can be the cheapest memory to collapse for khugepaged since no new hugepage allocation or copying of memory contents is necessary - we only need to update the mapping page tables. In the anonymous collapse path, we are able to collapse pte-mapped hugepages (albeit, perhaps suboptimally), but the file/shmem path makes no effort when compound pages (of any order) are encountered. Identify pte-mapped hugepages in the file/shmem collapse path. The final step of which makes a racy check of the value of the pmd to ensure it maps a pte table. This should be fine, since races that result in false-positive (i.e. attempt collapse even though we shouldn't) will fail later in collapse_pte_mapped_thp() once we actually lock mmap_lock and reinspect the pmd value. Races that result in false-negatives (i.e. where we decide to not attempt collapse, but should have) shouldn't be an issue, since in the worst case, we do nothing - which is what we've done up to this point. We make a similar check in retract_page_tables(). If we do think we've found a pte-mapped hugepgae in khugepaged context, attempt to update page tables mapping this hugepage. Note that these collapses still count towards the /sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed counter, and if the pte-mapped hugepage was also mapped into multiple process' address spaces, could be incremented for each page table update. Since we increment the counter when a pte-mapped hugepage is successfully added to the list of to-collapse pte-mapped THPs, it's possible that we never actually update the page table either. This is different from how file/shmem pages_collapsed accounting works today where only a successful page cache update is counted (it's also possible here that no page tables are actually changed). Though it incurs some slop, this is preferred to either not accounting for the event at all, or plumbing through data in struct mm_slot on whether to account for the collapse or not. Also note that work still needs to be done to support arbitrary compound pages, and that this should all be converted to using folios. [shy828301@gmail.com: Spelling mistake, update comment, and add Documentation] Link: https://lore.kernel.org/linux-mm/CAHbLzkpHwZxFzjfX9nxVoRhzup8WMjMfyL6Xiq8mZ9M-N3ombw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220907144521.3115321-3-zokeefe@google.com Link: https://lkml.kernel.org/r/20220922224046.1143204-3-zokeefe@google.comSigned-off-by: NZach O'Keefe <zokeefe@google.com> Reviewed-by: NYang Shi <shy828301@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Kennelly <ckennelly@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com> Cc: SeongJae Park <sj@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Liu Shixin 提交于
A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge zero pages that are really allocated for thp. It is misleading to increase THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page concurrently. Don't increase the value if the huge page is not really used. Update Documentation/admin-guide/mm/transhuge.rst to suit. Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.comSigned-off-by: NLiu Shixin <liushixin2@huawei.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 SeongJae Park 提交于
Commit b1840272 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface") announced the DAMON debugfs interface deprecation plan, but it is not so aggressively announced. As the deprecation time is coming, this commit makes the announce more easy to be found by adding the note at the beginning of the DAMON debugfs interface usage document. Link: https://lkml.kernel.org/r/20220909202901.57977-8-sj@kernel.orgSigned-off-by: NSeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 SeongJae Park 提交于
'Getting Started' document of DAMON says DAMON user-space tool, damo[1], is using DAMON debugfs interface, and therefore it needs to ensure debugfs is mounted. However, the latest version of the tool is using DAMON sysfs interface. Moreover, DAMON debugfs interface is going to be deprecated as announced by commit b1840272 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface"). This commit therefore update the document to tell readers about DAMON sysfs interface dependency instead and never mention about debugfs interface, which will be deprecated. [1] https://github.com/awslabs/damo Link: https://lkml.kernel.org/r/20220909202901.57977-7-sj@kernel.orgSigned-off-by: NSeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 SeongJae Park 提交于
The title of the DAMON document for admin-guide, 'Monitoring Data Accesses', could confuse readers in some ways. First of all, DAMON is not the only single way for data access monitoring. And the document is for not only the data access monitoring but also data access pattern based memory management optimizations (DAMOS). This commit updates the title to 'DAMON: Data Access MONitor', which more explicitly explains what the document describes. Link: https://lkml.kernel.org/r/20220909202901.57977-5-sj@kernel.org Fixes: c4ba6014 ("Documentation: add documents for DAMON") Signed-off-by: NSeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 01 10月, 2022 1 次提交
-
-
由 Joe Fradley 提交于
This patch adds the kunit.enable module parameter that will need to be set to true in addition to KUNIT being enabled for KUnit tests to run. The default value is true giving backwards compatibility. However, for the production+testing use case the new config option KUNIT_DEFAULT_ENABLED can be set to N requiring the tester to opt-in by passing kunit.enable=1 to the kernel. Signed-off-by: NJoe Fradley <joefradley@google.com> Reviewed-by: NDavid Gow <davidgow@google.com> Signed-off-by: NShuah Khan <skhan@linuxfoundation.org>
-
- 28 9月, 2022 6 次提交
-
-
由 Rafael J. Wysocki 提交于
Because https://01.org/linux-acpi web site has become permanently inaccessible, the "Overriding DSDT" document in the kernel tree pointing to it as the main source of information is useless (and the config option name mentioned by it is incorrect), so drop it and drop the pointer to it from the ACPI Kconfig. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Hoi Pok Wu 提交于
should be kB instead of Kb Signed-off-by: NHoi Pok Wu <wuhoipok@gmail.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Link: https://lore.kernel.org/r/20220922030645.9719-1-wuhoipok@gmail.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Lin Yujun 提交于
commit 7c693f54 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS") adds the "ibrs " option in Documentation/admin-guide/kernel-parameters.txt but omits it to Documentation/admin-guide/hw-vuln/spectre.rst, add it. Signed-off-by: NLin Yujun <linyujun809@huawei.com> Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Lukas Bulwahn 提交于
The current section 'If something goes wrong' makes a number of suggestions for debugging, bug hunting and reporting issues, which are quite briefly described in that section. However, the suggestions are also well covered in other kernel documentation or sometimes simply outdated. Here, each suggestion in that section is summarized, and then followed with its assessment, and the derived action for each suggestion: - use MAINTAINERS and mailing list: covered in 'Reporting issues', summarized in the short guide, detailed in its further section. Reporting issues even provides some specific examples that guides readers well through the needed steps. Refer to 'Reporting issues'. - contact Linus Torvalds: probably outdated as currently described. nevertheless covered in 'Reporting issues'. Reporting issues points out to contact the relevant kernel maintainers first, and after some patience and failed attempts with those maintainers, contacting Linus Torvalds might be okay. Refer to 'Reporting issues'. - tell what kernel, how to duplicate, the setup, if the problem is new or old and when did you notice: covered in 'Reporting issues', especially in Step-by-step guide how to report issues to the kernel maintainers. Refer to 'Reporting issues'. - duplicate kernel bug reports exactly: covered in 'Reporting issues', especially in Write and send the report. Refer to 'Reporting issues'. - read 'Bug hunting': keep this reference. Refer to 'Bug hunting'. - compile the kernel with CONFIG_KALLSYMS: covered in 'Reporting issues', especially in Decode failure messages. Refer to 'Reporting issues'. - alternatively, use ksymoops: ksymoops at the mentioned URL seems not to be maintained anymore. It was released roughly once a year until version 2.4.11 in 2005, but has not seen a new release since then. The information in ./scripts/ksymoops/README is from 1999, and does not give more insight on its actual maintenance state either. Ksymoops is mentioned as system utility in changes.rst, but also not recommended there. Drop the explanation on using ksymoops. - alternatively, lookup dump manually with the EIP and nm to determine the function in which the kernel crashes: this method seems already a quite advanced and low-level debugging method. Even all the further references on bug hunting and debugging do not mention it. Drop this alternative method and limit mentioning methods explained in the other existing kernel documentation. - read 'Reporting issues': keep this reference. Refer to 'Reporting issues'. - use gdb for debugging: some specific details, e.g., edit arch/x86/Makefile, are probably outdated or limited to one (historic important) setup. Using gdb is covered in 'Bug hunting', 'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel debugger internals'. Refer to those three documents. Overall, it is sufficient to refer to reporting-issues.rst, bug-hunting.rst, gdb-kernel-debugging.rst and kgdb.rst and this way cover the existing suggestions. 'Reporting issues' is quite new and probably up to date. 'Bug hunting', 'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel debugger internals' might need some revisit and update, but they are generally in an acceptable state for referring to them. Replace the existing suggestions by reference to other existing kernel documentation covering those suggestions---partly even nicely summarized and then explained in greater detail. Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220720041325.15693-3-lukas.bulwahn@gmail.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Lukas Bulwahn 提交于
Running a.out user programs with the latest kernel release is a very rare and uncommon use case nowadays. The support of a.out user programs is only remaining for the alpha architecture and is not defined and activated in the architecture's Kconfig (so even the activation of this support requires to modify the Kconfig file and not just kernel build configuration). The discussion on a.out support in 2019 (see Link) shows that the support of a.out user programs is just remaining for a special corner case from some (alpha architecture) users. There is no need to point out and mention this special feature to the general audience of kernel users. Delete the reference to this historic and special feature. Link: https://lore.kernel.org/all/CAHk-=wgt7M6yA5BJCJo0nF22WgPJnN8CvViL9CAJmd+S+Civ6w@mail.gmail.com/Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220720041325.15693-2-lukas.bulwahn@gmail.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
由 Akhil Raj 提交于
I have removed repeated `the` inside the documentation Signed-off-by: NAkhil Raj <lf32.dev@gmail.com> Link: https://lore.kernel.org/r/20220827145359.32599-1-lf32.dev@gmail.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>
-
- 27 9月, 2022 2 次提交
-
-
由 xu xin 提交于
Add the description of KSM profit and how to determine it separately in system-wide range and inner a single process. Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn> Reviewed-by: NXiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: NYang Yang <yang.yang29@zte.com.cn> Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Yu Zhao 提交于
Add an admin guide. Link: https://lkml.kernel.org/r/20220918080010.2920238-14-yuzhao@google.comSigned-off-by: NYu Zhao <yuzhao@google.com> Acked-by: NBrian Geffon <bgeffon@google.com> Acked-by: NJan Alexander Steffens (heftig) <heftig@archlinux.org> Acked-by: NOleksandr Natalenko <oleksandr@natalenko.name> Acked-by: NSteven Barrett <steven@liquorix.net> Acked-by: NSuleiman Souhlal <suleiman@google.com> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Tested-by: NDaniel Byrne <djbyrne@mtu.edu> Tested-by: NDonald Carr <d@chaos-reins.com> Tested-by: NHolger Hoffstätte <holger@applied-asynchrony.com> Tested-by: NKonstantin Kharlamov <Hi-Angel@yandex.ru> Tested-by: NShuang Zhai <szhai2@cs.rochester.edu> Tested-by: NSofia Trinh <sofia.trinh@edi.works> Tested-by: NVaibhav Jain <vaibhav@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Larabel <Michael@MichaelLarabel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 26 9月, 2022 1 次提交
-
-
由 Christophe Leroy 提交于
CONFIG_PPC_FSL_BOOK3E is redundant with CONFIG_PPC_E500. Rename it so that CONFIG_PPC_FSL_BOOK3E can be removed later. Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d3d42b395c09e66b0705fda1e51779f33e13ac38.1663606876.git.christophe.leroy@csgroup.eu
-
- 22 9月, 2022 1 次提交
-
-
由 Shuai Xue 提交于
Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: NShuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: NJonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220914022326.88550-2-xueshuai@linux.alibaba.comSigned-off-by: NWill Deacon <will@kernel.org>
-
- 17 9月, 2022 1 次提交
-
-
由 Yauheni Kaliuta 提交于
The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays. These days given unprivileged BPF is disabled by default, the main users for constant blinding coming from unprivileged in particular via cBPF -> eBPF migration (e.g. old-style socket filters). Signed-off-by: NYauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com
-
- 16 9月, 2022 1 次提交
-
-
由 Kefeng Wang 提交于
As commit 559089e0 ("vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an opt-in strategy, so it is saftly to support huge vmalloc mappings on arm64, for now, it is used in kvmalloc() and alloc_large_system_hash(). Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20220911044423.139229-1-wangkefeng.wang@huawei.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 12 9月, 2022 5 次提交
-
-
由 Petr Vorel 提交于
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch. This helps people who debug kernel with initramfs with minimal environment (i.e. without coreutils or even busybox) or allow to open sysfs file instead of run 'uname -m' in high level languages. Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.czSigned-off-by: NPetr Vorel <pvorel@suse.cz> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Sterba <dsterba@suse.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Li Zhe 提交于
In commit 2f1ee091 ("Revert "mm: use early_pfn_to_nid in page_ext_init""), we call page_ext_init() after page_alloc_init_late() to avoid some panic problem. It seems that we cannot track early page allocations in current kernel even if page structure has been initialized early. This patch introduces a new boot parameter 'early_page_ext' to resolve this problem. If we pass it to the kernel, page_ext_init() will be moved up and the feature 'deferred initialization of struct pages' will be disabled to initialize the page allocator early and prevent the panic problem above. It can help us to catch early page allocations. This is useful especially when we find that the free memory value is not the same right after different kernel booting. [akpm@linux-foundation.org: fix section issue by removing __meminitdata] Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.comSigned-off-by: NLi Zhe <lizhe.67@bytedance.com> Suggested-by: NMichal Hocko <mhocko@suse.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Huang Ying 提交于
In NUMA balancing memory tiering mode, if there are hot pages in slow memory node and cold pages in fast memory node, we need to promote/demote hot/cold pages between the fast and cold memory nodes. A choice is to promote/demote as fast as possible. But the CPU cycles and memory bandwidth consumed by the high promoting/demoting throughput will hurt the latency of some workload because of accessing inflating and slow memory bandwidth contention. A way to resolve this issue is to restrict the max promoting/demoting throughput. It will take longer to finish the promoting/demoting. But the workload latency will be better. This is implemented in this patch as the page promotion rate limit mechanism. The number of the candidate pages to be promoted to the fast memory node via NUMA balancing is counted, if the count exceeds the limit specified by the users, the NUMA balancing promotion will be stopped until the next second. A new sysctl knob kernel.numa_balancing_promote_rate_limit_MBps is added for the users to specify the limit. Link: https://lkml.kernel.org/r/20220713083954.34196-3-ying.huang@intel.comSigned-off-by: N"Huang, Ying" <ying.huang@intel.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Tested-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: osalvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zhong Jiang <zhongjiang-ali@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Charan Teja Kalla 提交于
Currently only 12 characters of the cma name is being used as the debug directories where as the cma name can be of length CMA_MAX_NAME(=64) characters. One side problem with this is having 2 cma's with first common 12 characters would end up in trying to create directories with same name and fails with -EEXIST thus can limit cma debug functionality. The 'cma-' prefix is used initially where cma areas don't have any names and are represented by simple integer values. Since now each cma would be having its own name, drop 'cma-' prefix for the cma debug directories as they are clearly evident that they are for cma debug through creating them in /sys/kernel/debug/cma/ path. Link: https://lkml.kernel.org/r/1660223729-22461-1-git-send-email-quic_charante@quicinc.comSigned-off-by: NCharan Teja Kalla <quic_charante@quicinc.com> Cc: David Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Pavan Kondeti <quic_pkondeti@quicinc.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Axel Rasmussen 提交于
Explain the different ways to create a new userfaultfd, and how access control works for each way. [axelrasmussen@google.com: improve wording in documentation, per Mike] Link: https://lkml.kernel.org/r/20220819205201.658693-5-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20220808175614.3885028-5-axelrasmussen@google.comSigned-off-by: NAxel Rasmussen <axelrasmussen@google.com> Acked-by: NPeter Xu <peterx@redhat.com> Reviewed-by: NShuah Khan <skhan@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nadav Amit <namit@vmware.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 10 9月, 2022 1 次提交
-
-
由 Liu Song 提交于
In our environment, it was found that the mitigation BHB has a great impact on the benchmark performance. For example, in the lmbench test, the "process fork && exit" test performance drops by 20%. So it is necessary to have the ability to turn off the mitigation individually through cmdline, thus avoiding having to compile the kernel by adjusting the config. Signed-off-by: NLiu Song <liusong@linux.alibaba.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1661514050-22263-1-git-send-email-liusong@linux.alibaba.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 09 9月, 2022 2 次提交
-
-
由 Chengming Zhou 提交于
PSI accounts stalls for each cgroup separately and aggregates it at each level of the hierarchy. This may cause non-negligible overhead for some workloads when under deep level of the hierarchy. commit 3958e2d0 ("cgroup: make per-cgroup pressure stall tracking configurable") make PSI to skip per-cgroup stall accounting, only account system-wide to avoid this each level overhead. But for our use case, we also want leaf cgroup PSI stats accounted for userspace adjustment on that cgroup, apart from only system-wide adjustment. So this patch introduce a per-cgroup PSI accounting disable/re-enable interface "cgroup.pressure", which is a read-write single value file that allowed values are "0" and "1", the defaults is "1" so per-cgroup PSI stats is enabled by default. Implementation details: It should be relatively straight-forward to disable and re-enable state aggregation, time tracking, averaging on a per-cgroup level, if we can live with losing history from while it was disabled. I.e. the avgs will restart from 0, total= will have gaps. But it's hard or complex to stop/restart groupc->tasks[] updates, which is not implemented in this patch. So we always update groupc->tasks[] and PSI_ONCPU bit in psi_group_change() even when the cgroup PSI stats is disabled. Suggested-by: NJohannes Weiner <hannes@cmpxchg.org> Suggested-by: NTejun Heo <tj@kernel.org> Signed-off-by: NChengming Zhou <zhouchengming@bytedance.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com
-
由 Chengming Zhou 提交于
Now PSI already tracked workload pressure stall information for CPU, memory and IO. Apart from these, IRQ/SOFTIRQ could have obvious impact on some workload productivity, such as web service workload. When CONFIG_IRQ_TIME_ACCOUNTING, we can get IRQ/SOFTIRQ delta time from update_rq_clock_task(), in which we can record that delta to CPU curr task's cgroups as PSI_IRQ_FULL status. Note we don't use PSI_IRQ_SOME since IRQ/SOFTIRQ always happen in the current task on the CPU, make nothing productive could run even if it were runnable, so we only use PSI_IRQ_FULL. Signed-off-by: NChengming Zhou <zhouchengming@bytedance.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220825164111.29534-8-zhouchengming@bytedance.com
-