    mm, page_owner, debug_pagealloc: save and dump freeing stack trace · 8974558f
    Committed by Vlastimil Babka
    The debug_pagealloc functionality is useful to catch buggy page allocator
    users that cause e.g.  use after free or double free.  When page
    inconsistency is detected, debugging is often simpler by knowing the call
    stack of process that last allocated and freed the page.  When page_owner
    is also enabled, we record the allocation stack trace, but not freeing.
    
    This patch therefore adds recording of freeing process stack trace to page
    owner info, if both page_owner and debug_pagealloc are configured and
    enabled.  With only page_owner enabled, this info is not useful for the
    memory leak debugging use case.  dump_page() is adjusted to print the
    info.  An example result of calling __free_pages() twice may look like
    this (note the page last free stack trace):
    
    BUG: Bad page state in process bash  pfn:13d8f8
    page:ffffc31984f63e00 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
    flags: 0x1affff800000000()
    raw: 01affff800000000 dead000000000100 dead000000000122 0000000000000000
    raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
    page dumped because: nonzero _refcount
    page_owner tracks the page as freed
    page last allocated via order 0, migratetype Unmovable, gfp_mask 0xcc0(GFP_KERNEL)
     prep_new_page+0x143/0x150
     get_page_from_freelist+0x289/0x380
     __alloc_pages_nodemask+0x13c/0x2d0
     khugepaged+0x6e/0xc10
     kthread+0xf9/0x130
     ret_from_fork+0x3a/0x50
    page last free stack trace:
     free_pcp_prepare+0x134/0x1e0
     free_unref_page+0x18/0x90
     khugepaged+0x7b/0xc10
     kthread+0xf9/0x130
     ret_from_fork+0x3a/0x50
    Modules linked in:
    CPU: 3 PID: 271 Comm: bash Not tainted 5.3.0-rc4-2.g07a1a73-default+ #57
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
    Call Trace:
     dump_stack+0x85/0xc0
     bad_page.cold+0xba/0xbf
     rmqueue_pcplist.isra.0+0x6c5/0x6d0
     rmqueue+0x2d/0x810
     get_page_from_freelist+0x191/0x380
     __alloc_pages_nodemask+0x13c/0x2d0
     __get_free_pages+0xd/0x30
     __pud_alloc+0x2c/0x110
     copy_page_range+0x4f9/0x630
     dup_mmap+0x362/0x480
     dup_mm+0x68/0x110
     copy_process+0x19e1/0x1b40
     _do_fork+0x73/0x310
     __x64_sys_clone+0x75/0x80
     do_syscall_64+0x6e/0x1e0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7f10af854a10
    ...
    
    Link: http://lkml.kernel.org/r/20190820131828.22684-5-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Kirill A. Shutemov <kirill@shutemov.name>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
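
The commit message above says the freeing stack trace is recorded only when both page_owner and debug_pagealloc are configured and enabled. A minimal sketch of a setup that should satisfy this, assuming an architecture that provides STACKTRACE_SUPPORT, might look as follows. The option names and boot parameters are taken from the Kconfig.debug help text in this file; the grouping into a single fragment is illustrative only.

```
# .config fragment (sketch, not a complete configuration)
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_PAGE_OWNER=y

# Kernel command line additions (both features are off by default
# unless DEBUG_PAGEALLOC_ENABLE_DEFAULT is also set)
page_owner=on debug_pagealloc=on
```

PAGE_OWNER selects DEBUG_FS, STACKTRACE, STACKDEPOT and PAGE_EXTENSION itself, so those do not need to be listed by hand.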
Kconfig.debug 4.7 KB
# SPDX-License-Identifier: GPL-2.0-only
config PAGE_EXTENSION
	bool "Extend memmap on extra space for more information on page"
	---help---
	  Extend memmap on extra space for more information on page. This
	  could be used for debugging features that need to insert extra
	  field for every page. This extension enables us to save memory
	  by not allocating this extra memory according to boottime
	  configuration.

config DEBUG_PAGEALLOC
	bool "Debug page memory allocations"
	depends on DEBUG_KERNEL
	depends on !HIBERNATION || ARCH_SUPPORTS_DEBUG_PAGEALLOC && !PPC && !SPARC
	select PAGE_POISONING if !ARCH_SUPPORTS_DEBUG_PAGEALLOC
	---help---
	  Unmap pages from the kernel linear mapping after free_pages().
	  Depending on runtime enablement, this results in a small or large
	  slowdown, but helps to find certain types of memory corruption.

	  Also, the state of page tracking structures is checked more often as
	  pages are being allocated and freed, as unexpected state changes
	  often happen for the same reasons as memory corruption (e.g. double
	  free, use-after-free). The error reports for these checks can be
	  augmented with stack traces of the last allocation and freeing of
	  the page, when PAGE_OWNER is also selected and enabled on boot.

	  For architectures which don't enable ARCH_SUPPORTS_DEBUG_PAGEALLOC,
	  fill the pages with poison patterns after free_pages() and verify
	  the patterns before alloc_pages(). Additionally, this option cannot
	  be enabled in combination with hibernation as that would result in
	  incorrect warnings of memory corruption after a resume because free
	  pages are not saved to the suspend image.

	  By default this option will have a small overhead, e.g. by not
	  allowing the kernel mapping to be backed by large pages on some
	  architectures. Even bigger overhead comes when the debugging is
	  enabled by DEBUG_PAGEALLOC_ENABLE_DEFAULT or the debug_pagealloc
	  command line parameter.

config DEBUG_PAGEALLOC_ENABLE_DEFAULT
	bool "Enable debug page memory allocations by default?"
	depends on DEBUG_PAGEALLOC
	---help---
	  Enable debug page memory allocations by default? This value
	  can be overridden by debug_pagealloc=off|on.

config PAGE_OWNER
	bool "Track page owner"
	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
	select DEBUG_FS
	select STACKTRACE
	select STACKDEPOT
	select PAGE_EXTENSION
	help
	  This keeps track of which call chain owns each page, which may
	  help to find bare alloc_page(s) leaks. Even if you include this
	  feature in your build, it is disabled by default. Pass
	  "page_owner=on" as a boot parameter to enable it. It consumes
	  a fair amount of memory if enabled. See tools/vm/page_owner_sort.c
	  for a user-space helper.

	  If unsure, say N.

config PAGE_POISONING
	bool "Poison pages after freeing"
	select PAGE_POISONING_NO_SANITY if HIBERNATION
	---help---
	  Fill the pages with poison patterns after free_pages() and verify
	  the patterns before alloc_pages(). The filling of the memory helps
	  reduce the risk of information leaks from freed data. This does
	  have a potential performance impact if enabled with the
	  "page_poison=1" kernel boot option.

	  Note that "poison" here is not the same thing as the "HWPoison"
	  for CONFIG_MEMORY_FAILURE. This is software poisoning only.

	  If unsure, say N

config PAGE_POISONING_NO_SANITY
	depends on PAGE_POISONING
	bool "Only poison, don't sanity check"
	---help---
	   Skip the sanity checking on alloc, only fill the pages with
	   poison on free. This reduces some of the overhead of the
	   poisoning feature.

	   If you are only interested in sanitization, say Y. Otherwise
	   say N.

config PAGE_POISONING_ZERO
	bool "Use zero for poisoning instead of debugging value"
	depends on PAGE_POISONING
	---help---
	   Instead of using the existing poison value, fill the pages with
	   zeros. This makes it harder to detect when errors are occurring
	   due to sanitization but the zeroing at free means that it is
	   no longer necessary to write zeros when GFP_ZERO is used on
	   allocation.

	   If unsure, say N

config DEBUG_PAGE_REF
	bool "Enable tracepoint to track down page reference manipulation"
	depends on DEBUG_KERNEL
	depends on TRACEPOINTS
	---help---
	  This adds tracepoints for tracking down page reference
	  manipulation. The tracking is useful for diagnosing functional
	  failures due to migration failures caused by page reference
	  mismatches. Be careful when enabling this feature because it adds
	  about 30 KB to the kernel code. However, the runtime performance
	  overhead is virtually nil until the tracepoints are actually enabled.

config DEBUG_RODATA_TEST
	bool "Testcase for marking rodata read-only"
	depends on STRICT_KERNEL_RWX
	---help---
	  This option enables a testcase for marking rodata read-only.