1. 24 Jan, 2023 3 commits
    • panic: Introduce warn_limit · f53b6dda
      Committed by Kees Cook
      commit 9fc9e278 upstream.
      
      Like oops_limit, add warn_limit for limiting the number of warnings when
      panic_on_warn is not set.
      
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: tangmeng <tangmeng@uniontech.com>
      Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: linux-doc@vger.kernel.org
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20221117234328.594699-5-keescook@chromium.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
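One way to check how a running system is configured is to read the two limits back from procfs. The sketch below assumes the limits are exposed as the sysctls kernel.warn_limit and kernel.oops_limit under /proc/sys/kernel, and that warn_limit, like oops_limit, treats 0 as "disabled"; the helper names read_limit and describe are hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_limit(name: str, sysctl_dir: str = "/proc/sys/kernel") -> Optional[int]:
    """Return the integer value of a kernel.* sysctl, or None if it is unavailable."""
    path = Path(sysctl_dir) / name
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError, NotADirectoryError):
        # Older kernels do not have the file; some sandboxes hide it.
        return None


def describe(name: str, value: Optional[int]) -> str:
    """Turn a limit value into a one-line human-readable summary."""
    if value is None:
        return f"{name}: not available on this kernel"
    if value == 0:
        # Assumption: 0 disables the counter check for both limits.
        return f"{name}: 0 (limit disabled)"
    return f"{name}: panic after {value} events"


if __name__ == "__main__":
    for limit_name in ("warn_limit", "oops_limit"):
        print(describe(limit_name, read_limit(limit_name)))
```

On a kernel without these commits, both lines simply report that the limit is not available.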
    • exit: Allow oops_limit to be disabled · e0738725
      Committed by Kees Cook
      commit de92f657 upstream.
      
      In preparation for keeping oops_limit logic in sync with warn_limit,
      have oops_limit == 0 disable checking the Oops counter.
      
      Cc: Jann Horn <jannh@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-doc@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
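The "limit of 0 disables the check" rule can be illustrated with a small userspace model. This is only a sketch of the semantics described in the commit message, not the kernel code; the class name EventLimiter and its method are hypothetical.

```python
class EventLimiter:
    """Userspace model of a 'panic after N events' limit; a limit of 0 disables it."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def record(self) -> bool:
        """Count one event and report whether the limit has now been reached."""
        self.count += 1
        # With limit == 0 the counter still advances, but it never triggers.
        return self.limit != 0 and self.count >= self.limit


if __name__ == "__main__":
    limiter = EventLimiter(limit=3)
    print([limiter.record() for _ in range(4)])
```

Keeping the counter running even when the limit is 0 means the limit can be enabled later without losing the count accumulated so far; whether the kernel relies on that property is not stated in the commit message.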
    • exit: Put an upper limit on how often we can oops · 767997ef
      Committed by Jann Horn
      commit d4ccd54d upstream.
      
      Many Linux systems are configured to not panic on oops; but allowing an
      attacker to oops the system **really** often can make even bugs that look
      completely unexploitable exploitable (like NULL dereferences and such) if
      each crash elevates a refcount by one or a lock is taken in read mode, and
      this causes a counter to eventually overflow.
      
      The most interesting counters for this are 32 bits wide (like open-coded
      refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
      platforms is just 16 bits, but probably nobody cares about 32-bit platforms
      that much nowadays.)
      
      So let's panic the system if the kernel is constantly oopsing.
      
      The speed of oopsing 2^32 times probably depends on several factors, like
      how long the stack trace is and which unwinder you're using; an empirically
      important one is whether your console is showing a graphical environment or
      a text console that oopses will be printed to.
      In a quick single-threaded benchmark, it looks like oopsing in a vfork()
      child with a very short stack trace only takes ~510 microseconds per run
      when a graphical console is active; but switching to a text console that
      oopses are printed to slows it down around 87x, to ~45 milliseconds per
      run.
      (Adding more threads makes this faster, but the actual oops printing
      happens under &die_lock on x86, so you can maybe speed this up by a factor
      of around 2 and then any further improvement gets eaten up by lock
      contention.)
      
      It looks like it would take around 8-12 days to overflow a 32-bit counter
      with repeated oopsing on a multi-core X86 system running a graphical
      environment; both me (in an X86 VM) and Seth (with a distro kernel on
      normal hardware in a standard configuration) got numbers in that ballpark.
      
      12 days aren't *that* short on a desktop system, and you'd likely need much
      longer on a typical server system (assuming that people don't run graphical
      desktop environments on their servers), and this is a *very* noisy and
      violent approach to exploiting the kernel; and it also seems to take orders
      of magnitude longer on some machines, probably because stuff like EFI
      pstore will slow it down a ton if that's active.

      Signed-off-by: Jann Horn <jannh@google.com>
      Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.com
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
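The time estimates in this message can be roughly reproduced from the per-oops timings it quotes. The sketch below is a back-of-the-envelope calculation only: it takes the ~510 microsecond and ~45 millisecond figures and the "factor of around 2" speedup from the text, and the function name days_to_overflow is hypothetical. The measured 8-12 day range in the message comes from real runs, so the computed values are expected to land near that range, not match it exactly.

```python
SECONDS_PER_DAY = 86_400


def days_to_overflow(seconds_per_oops: float, counter_bits: int = 32,
                     parallel_speedup: float = 1.0) -> float:
    """Days needed to oops 2**counter_bits times at the given rate."""
    oopses = 2 ** counter_bits
    return oopses * seconds_per_oops / parallel_speedup / SECONDS_PER_DAY


if __name__ == "__main__":
    graphical = 510e-6   # ~510 microseconds per oops, graphical console active
    text = 45e-3         # ~45 milliseconds per oops, text console printing oopses
    print(f"slowdown on text console: {text / graphical:.0f}x")
    print(f"graphical, single thread: {days_to_overflow(graphical):.1f} days")
    print(f"graphical, ~2x parallel:  {days_to_overflow(graphical, parallel_speedup=2):.1f} days")
    print(f"text console:             {days_to_overflow(text) / 365:.1f} years")
```

With these inputs, a single thread needs a bit over 25 days, the 2x parallel case lands just under 13 days, and the text console case takes years, which is consistent with the message's point that servers without a graphical environment are much harder to attack this way.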
  2. 07 Jan, 2023 1 commit
  3. 31 Dec, 2022 1 commit
  4. 23 Nov, 2022 2 commits
  5. 25 Oct, 2022 1 commit
  6. 19 Oct, 2022 1 commit
  7. 14 Oct, 2022 1 commit
  8. 11 Oct, 2022 1 commit
  9. 06 Oct, 2022 1 commit
  10. 04 Oct, 2022 7 commits
    • mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol · e55b9f96
      Committed by Johannes Weiner
      Since 2d1c4980 ("mm: memcontrol: make swap tracking an integral part
      of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config
      option anymore; it just means CONFIG_MEMCG && CONFIG_SWAP.
      
      Update the sites accordingly and drop the symbol.
      
      [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM,
        which hasn't been a user-visible symbol for over half a decade. ]
      
      Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: memcontrol: deprecate swapaccounting=0 mode · b25806dc
      Committed by Johannes Weiner
      The swapaccounting= command-line option already does very little today.  To
      close a trivial containment failure case, the swap ownership tracking part
      of the swap controller has recently become mandatory (see commit
      2d1c4980 ("mm: memcontrol: make swap tracking an integral part of
      memory control") for details), which makes up the majority of the work
      during swapout, swapin, and the swap slot map.
      
      The only thing left under this flag is the page_counter operations and the
      visibility of the swap control files in the first place, which are rather
      meager savings.  There also aren't many scenarios, if any, where
      controlling the memory of a cgroup while allowing it unlimited access to a
      global swap space is a workable resource isolation strategy.
      
      On the other hand, there have been several bugs and confusion around the
      many possible swap controller states (cgroup1 vs cgroup2 behavior, memory
      accounting without swap accounting, memcg runtime disabled).
      
      This puts the maintenance overhead of retaining the toggle above its
      practical benefits.  Deprecate it.
      
      Link: https://lkml.kernel.org/r/20220926135704.400818-3-hannes@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Suggested-by: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmds · 58ac9a89
      Committed by Zach O'Keefe
      The main benefit of THPs is that they can be mapped at the pmd level,
      increasing the likelihood of a TLB hit and spending fewer cycles in page
      table walks.  pte-mapped hugepages - that is - hugepage-aligned compound
      pages of order HPAGE_PMD_ORDER mapped by ptes - although being contiguous
      in physical memory, don't have this advantage.  In fact, one could argue
      they are detrimental to system performance overall since they occupy a
      precious hugepage-aligned/sized region of physical memory that could
      otherwise be used more effectively.  Additionally, pte-mapped hugepages
      can be the cheapest memory to collapse for khugepaged since no new
      hugepage allocation or copying of memory contents is necessary - we only
      need to update the mapping page tables.
      
      In the anonymous collapse path, we are able to collapse pte-mapped
      hugepages (albeit, perhaps suboptimally), but the file/shmem path makes no
      effort when compound pages (of any order) are encountered.
      
      Identify pte-mapped hugepages in the file/shmem collapse path.  The
      final step of this makes a racy check of the value of the pmd to
      ensure it maps a pte table.  This should be fine, since races that
      result in false positives (i.e.  attempting collapse even though we
      shouldn't) will fail later in collapse_pte_mapped_thp() once we
      actually lock mmap_lock and reinspect the pmd value.  Races that result
      in false negatives (i.e.  where we decide to not attempt collapse, but
      should have) shouldn't be an issue, since in the worst case, we do
      nothing - which is what we've done up to this point.  We make a similar
      check in retract_page_tables().  If we do think we've found a
      pte-mapped hugepage in khugepaged context, attempt to update page
      tables mapping this hugepage.
      
      Note that these collapses still count towards the
      /sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed counter,
      and if the pte-mapped hugepage was also mapped into multiple processes'
      address spaces, could be incremented for each page table update.  Since we
      increment the counter when a pte-mapped hugepage is successfully added to
      the list of to-collapse pte-mapped THPs, it's possible that we never
      actually update the page table either.  This is different from how
      file/shmem pages_collapsed accounting works today where only a successful
      page cache update is counted (it's also possible here that no page tables
      are actually changed).  Though it incurs some slop, this is preferred to
      either not accounting for the event at all, or plumbing through data in
      struct mm_slot on whether to account for the collapse or not.
      
      Also note that work still needs to be done to support arbitrary compound
      pages, and that this should all be converted to using folios.
      
      [shy828301@gmail.com: Spelling mistake, update comment, and add Documentation]
        Link: https://lore.kernel.org/linux-mm/CAHbLzkpHwZxFzjfX9nxVoRhzup8WMjMfyL6Xiq8mZ9M-N3ombw@mail.gmail.com/
      Link: https://lkml.kernel.org/r/20220907144521.3115321-3-zokeefe@google.com
      Link: https://lkml.kernel.org/r/20220922224046.1143204-3-zokeefe@google.com
      Signed-off-by: Zach O'Keefe <zokeefe@google.com>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Chris Kennelly <ckennelly@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: James Houghton <jthoughton@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
      Cc: SeongJae Park <sj@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
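The "racy check first, authoritative check under the lock later" reasoning in this message follows a common pattern that can be sketched in userspace. The model below is purely illustrative: PmdSlot and try_collapse are hypothetical names, a threading.Lock stands in for mmap_lock, and a boolean stands in for "the pmd maps a pte table". It does not reflect the actual kernel data structures.

```python
import threading


class PmdSlot:
    """Toy stand-in for a pmd entry that may or may not map a pte table."""

    def __init__(self, maps_pte_table: bool) -> None:
        self.maps_pte_table = maps_pte_table
        self.lock = threading.Lock()   # plays the role of mmap_lock in this model
        self.collapsed = False


def try_collapse(slot: PmdSlot) -> bool:
    """Attempt a collapse; return True only if this call performed it."""
    # Cheap, unlocked check. A stale "no" (false negative) just means we do
    # nothing, which is the behaviour before the change.
    if not slot.maps_pte_table:
        return False
    with slot.lock:
        # Re-inspect under the lock. A stale "yes" (false positive) from the
        # unlocked check is caught here and the attempt fails harmlessly.
        if not slot.maps_pte_table:
            return False
        slot.maps_pte_table = False
        slot.collapsed = True
        return True


if __name__ == "__main__":
    slot = PmdSlot(maps_pte_table=True)
    print(try_collapse(slot), try_collapse(slot))
```

The unlocked check exists only to avoid taking the lock when there is clearly nothing to do; correctness rests entirely on the second check.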
    • mm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twice · f4981502
      Committed by Liu Shixin
      A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge
      zero pages that are actually allocated for THP.  It is misleading to
      increment THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page
      concurrently.  Don't increment the value if the huge page is not actually
      used.
      
      Update Documentation/admin-guide/mm/transhuge.rst to suit.
      
      Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.com
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
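The double-counting problem described here is the usual "count before you know you won the race" mistake. The sketch below models the fixed behaviour in userspace under stated assumptions: two callers each prepare a candidate page, only one can install it into a shared slot, and the event counter is bumped only for the winner. ZeroPageSlot, install and alloc_events are hypothetical names, and a lock-protected compare-and-swap stands in for an atomic one.

```python
import threading


class ZeroPageSlot:
    """Shared slot holding at most one installed page, with an event counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # emulates an atomic compare-and-swap
        self.page = None
        self.alloc_events = 0          # model of a THP_ZERO_PAGE_ALLOC-style counter

    def _cmpxchg(self, expected: object, new: object) -> bool:
        """Install new only if the slot still holds expected."""
        with self._lock:
            if self.page is expected:
                self.page = new
                return True
            return False

    def install(self, candidate: object) -> object:
        """Try to install candidate; return whichever page ends up in use."""
        if self._cmpxchg(None, candidate):
            # Count only after winning the race, so a discarded candidate
            # never shows up in the statistics.
            with self._lock:
                self.alloc_events += 1
            return candidate
        return self.page


if __name__ == "__main__":
    slot = ZeroPageSlot()
    first, second = object(), object()
    slot.install(first)
    slot.install(second)   # loses the race; its candidate is not counted
    print(slot.alloc_events)
```

Counting before the compare-and-swap would report two allocations in this scenario even though only one page is ever used.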
    • Docs/admin-guide/mm/damon/usage: note DAMON debugfs interface deprecation plan · f1f3afd5
      Committed by SeongJae Park
      Commit b1840272 ("Docs/admin-guide/mm/damon/usage: document DAMON
      sysfs interface") announced the DAMON debugfs interface deprecation plan,
      but the announcement is not very prominent.  As the deprecation time is
      approaching, this commit makes the announcement easier to find by adding
      a note at the beginning of the DAMON debugfs interface usage document.
      
      Link: https://lkml.kernel.org/r/20220909202901.57977-8-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Yun Levi <ppbuk5246@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Docs/admin-guide/mm/damon/start: mention the dependency as sysfs instead of debugfs · 04cc7e4b
      Committed by SeongJae Park
      The DAMON 'Getting Started' document says that the DAMON user-space tool,
      damo[1], uses the DAMON debugfs interface, and that debugfs therefore
      needs to be mounted.  However, the latest version of the tool uses the
      DAMON sysfs interface.  Moreover, the DAMON debugfs interface is going to
      be deprecated, as announced by commit b1840272
      ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface").

      This commit therefore updates the document to tell readers about the DAMON
      sysfs interface dependency instead, and no longer mentions the debugfs
      interface, which will be deprecated.
      
      [1] https://github.com/awslabs/damo
      
      Link: https://lkml.kernel.org/r/20220909202901.57977-7-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Yun Levi <ppbuk5246@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Docs/admin-guide/mm/damon: rename the title of the document · 0ff11f10
      Committed by SeongJae Park
      The title of the DAMON document for admin-guide, 'Monitoring Data
      Accesses', could confuse readers in some ways.  First of all, DAMON is not
      the only way to monitor data accesses.  Moreover, the document covers
      not only data access monitoring but also data access pattern based
      memory management optimizations (DAMOS).  This commit updates the title to
      'DAMON: Data Access MONitor', which more explicitly explains what the
      document describes.
      
      Link: https://lkml.kernel.org/r/20220909202901.57977-5-sj@kernel.org
      Fixes: c4ba6014 ("Documentation: add documents for DAMON")
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Yun Levi <ppbuk5246@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  11. 01 Oct, 2022 1 commit
  12. 28 Sep, 2022 6 commits
    • ACPI: docs: Drop useless DSDT override documentation · d206cef0
      Committed by Rafael J. Wysocki
      Because the https://01.org/linux-acpi web site has become permanently
      inaccessible, the "Overriding DSDT" document in the kernel tree
      pointing to it as the main source of information is useless (and
      the config option name mentioned by it is incorrect), so drop it
      and drop the pointer to it from the ACPI Kconfig.

      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • H
    • Documentation/hw-vuln: Update spectre doc · 06cb31cc
      Committed by Lin Yujun
      Commit 7c693f54 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS")
      adds the "ibrs" option to
      Documentation/admin-guide/kernel-parameters.txt but omits it from
      Documentation/admin-guide/hw-vuln/spectre.rst. Add it.

      Signed-off-by: Lin Yujun <linyujun809@huawei.com>
      Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.com
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    • docs: admin-guide: for kernel bugs refer to other kernel documentation · 32a3a9db
      Committed by Lukas Bulwahn
      The current section 'If something goes wrong' makes a number of suggestions
      for debugging, bug hunting and reporting issues, which are quite briefly
      described in that section.
      
      However, the suggestions are also well covered in other kernel
      documentation or sometimes simply outdated. Here, each suggestion in that
      section is summarized, and then followed with its assessment, and the
      derived action for each suggestion:
      
        - use MAINTAINERS and mailing list: covered in 'Reporting issues',
          summarized in the short guide, detailed in its further section.
          'Reporting issues' even provides some specific examples that guide
          readers well through the needed steps. Refer to 'Reporting issues'.
      
        - contact Linus Torvalds: probably outdated as currently described,
          but nevertheless covered in 'Reporting issues', which advises
          contacting the relevant kernel maintainers first; after some
          patience and failed attempts with those maintainers, contacting Linus
          Torvalds might be okay. Refer to 'Reporting issues'.
      
        - tell what kernel, how to duplicate, the setup, if the problem is new
          or old and when did you notice: covered in 'Reporting issues',
          especially in Step-by-step guide how to report issues to the kernel
          maintainers. Refer to 'Reporting issues'.
      
        - duplicate kernel bug reports exactly: covered in 'Reporting issues',
          especially in Write and send the report. Refer to 'Reporting issues'.
      
        - read 'Bug hunting': keep this reference. Refer to 'Bug hunting'.
      
        - compile the kernel with CONFIG_KALLSYMS: covered in 'Reporting issues',
          especially in Decode failure messages. Refer to 'Reporting issues'.
      
        - alternatively, use ksymoops: ksymoops at the mentioned URL seems not to
          be maintained anymore. It was released roughly once a year until
          version 2.4.11 in 2005, but has not seen a new release since then. The
          information in ./scripts/ksymoops/README is from 1999, and does not
          give more insight on its actual maintenance state either. Ksymoops is
          mentioned as system utility in changes.rst, but also not recommended
          there. Drop the explanation on using ksymoops.
      
        - alternatively, look up the dump manually with the EIP and nm to determine
          the function in which the kernel crashes: this method already seems a
          quite advanced and low-level debugging method. Even all the further
          references on bug hunting and debugging do not mention it. Drop this
          alternative method and limit mentions to methods explained in the other
          existing kernel documentation.
      
        - read 'Reporting issues': keep this reference.
          Refer to 'Reporting issues'.
      
        - use gdb for debugging: some specific details, e.g., edit
          arch/x86/Makefile, are probably outdated or limited to one (historically
          important) setup. Using gdb is covered in 'Bug hunting', 'Debugging
          kernel and modules via gdb' and 'Using kgdb, kdb and the kernel
          debugger internals'. Refer to those three documents.
      
      Overall, it is sufficient to refer to reporting-issues.rst,
      bug-hunting.rst, gdb-kernel-debugging.rst and kgdb.rst and this way cover
      the existing suggestions.
      
      'Reporting issues' is quite new and probably up to date. 'Bug hunting',
      'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel
      debugger internals' might need some revisiting and updating, but they are
      generally in an acceptable state for referring to them.
      
      Replace the existing suggestions by reference to other existing kernel
      documentation covering those suggestions---partly even nicely summarized
      and then explained in greater detail.

      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Link: https://lore.kernel.org/r/20220720041325.15693-3-lukas.bulwahn@gmail.com
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    • docs: admin-guide: do not mention the 'run a.out user programs' feature · 3f10b508
      Committed by Lukas Bulwahn
      Running a.out user programs with the latest kernel release is a very rare
      use case nowadays. Support for a.out user programs remains only for the
      alpha architecture and is not defined and activated in the architecture's
      Kconfig (so even activating this support requires modifying the Kconfig
      file and not just the kernel build configuration).

      The discussion on a.out support in 2019 (see Link) shows that support
      for a.out user programs remains only for a special corner case of
      some (alpha architecture) users.
      
      There is no need to point out and mention this special feature to the
      general audience of kernel users. Delete the reference to this historic and
      special feature.
      
      Link: https://lore.kernel.org/all/CAHk-=wgt7M6yA5BJCJo0nF22WgPJnN8CvViL9CAJmd+S+Civ6w@mail.gmail.com/
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Link: https://lore.kernel.org/r/20220720041325.15693-2-lukas.bulwahn@gmail.com
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    • Remove duplicate words inside documentation · d2bef8e1
      Committed by Akhil Raj
      I have removed repeated `the` inside the documentation

      Signed-off-by: Akhil Raj <lf32.dev@gmail.com>
      Link: https://lore.kernel.org/r/20220827145359.32599-1-lf32.dev@gmail.com
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
  13. 27 Sep, 2022 2 commits
  14. 26 Sep, 2022 1 commit
  15. 22 Sep, 2022 1 commit
  16. 17 Sep, 2022 1 commit
  17. 16 Sep, 2022 1 commit
  18. 12 Sep, 2022 5 commits
  19. 10 Sep, 2022 1 commit
  20. 09 Sep, 2022 2 commits