Commit 24844fd3 authored by Jonathan Corbet

Merge branch 'mm-rst' into docs-next

Mike Rapoport says:

  These patches convert files in Documentation/vm to ReST format, add an
  initial index and link it to the top level documentation.

  There are no content changes in the documentation, except for a few spelling
  fixes. The relatively large diffstat stems from the indentation and
  paragraph wrapping changes.

  I've tried to keep the formatting as consistent as possible, but I may have
  missed some places that needed markup or added markup where it was not
  necessary.

[jc: significant conflicts in vm/hmm.rst]
...@@ -90,4 +90,4 @@ Date: December 2009 ...@@ -90,4 +90,4 @@ Date: December 2009
Contact: Lee Schermerhorn <lee.schermerhorn@hp.com> Contact: Lee Schermerhorn <lee.schermerhorn@hp.com>
Description: Description:
The node's huge page size control/query attributes. The node's huge page size control/query attributes.
See Documentation/vm/hugetlbpage.txt See Documentation/vm/hugetlbpage.rst
\ No newline at end of file \ No newline at end of file
...@@ -12,4 +12,4 @@ Description: ...@@ -12,4 +12,4 @@ Description:
free_hugepages free_hugepages
surplus_hugepages surplus_hugepages
resv_hugepages resv_hugepages
See Documentation/vm/hugetlbpage.txt for details. See Documentation/vm/hugetlbpage.rst for details.
...@@ -40,7 +40,7 @@ Description: Kernel Samepage Merging daemon sysfs interface ...@@ -40,7 +40,7 @@ Description: Kernel Samepage Merging daemon sysfs interface
sleep_millisecs: how many milliseconds ksm should sleep between sleep_millisecs: how many milliseconds ksm should sleep between
scans. scans.
See Documentation/vm/ksm.txt for more information. See Documentation/vm/ksm.rst for more information.
What: /sys/kernel/mm/ksm/merge_across_nodes What: /sys/kernel/mm/ksm/merge_across_nodes
Date: January 2013 Date: January 2013
......
...@@ -37,7 +37,7 @@ Description: ...@@ -37,7 +37,7 @@ Description:
The alloc_calls file is read-only and lists the kernel code The alloc_calls file is read-only and lists the kernel code
locations from which allocations for this cache were performed. locations from which allocations for this cache were performed.
The alloc_calls file only contains information if debugging is The alloc_calls file only contains information if debugging is
enabled for that cache (see Documentation/vm/slub.txt). enabled for that cache (see Documentation/vm/slub.rst).
What: /sys/kernel/slab/cache/alloc_fastpath What: /sys/kernel/slab/cache/alloc_fastpath
Date: February 2008 Date: February 2008
...@@ -219,7 +219,7 @@ Contact: Pekka Enberg <penberg@cs.helsinki.fi>, ...@@ -219,7 +219,7 @@ Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Description: Description:
The free_calls file is read-only and lists the locations of The free_calls file is read-only and lists the locations of
object frees if slab debugging is enabled (see object frees if slab debugging is enabled (see
Documentation/vm/slub.txt). Documentation/vm/slub.rst).
What: /sys/kernel/slab/cache/free_fastpath What: /sys/kernel/slab/cache/free_fastpath
Date: February 2008 Date: February 2008
......
...@@ -3915,7 +3915,7 @@ ...@@ -3915,7 +3915,7 @@
cache (risks via metadata attacks are mostly cache (risks via metadata attacks are mostly
unchanged). Debug options disable merging on their unchanged). Debug options disable merging on their
own. own.
For more information see Documentation/vm/slub.txt. For more information see Documentation/vm/slub.rst.
slab_max_order= [MM, SLAB] slab_max_order= [MM, SLAB]
Determines the maximum allowed order for slabs. Determines the maximum allowed order for slabs.
...@@ -3929,7 +3929,7 @@ ...@@ -3929,7 +3929,7 @@
slub_debug can create guard zones around objects and slub_debug can create guard zones around objects and
may poison objects when not in use. Also tracks the may poison objects when not in use. Also tracks the
last alloc / free. For more information see last alloc / free. For more information see
Documentation/vm/slub.txt. Documentation/vm/slub.rst.
slub_memcg_sysfs= [MM, SLUB] slub_memcg_sysfs= [MM, SLUB]
Determines whether to enable sysfs directories for Determines whether to enable sysfs directories for
...@@ -3943,7 +3943,7 @@ ...@@ -3943,7 +3943,7 @@
Determines the maximum allowed order for slabs. Determines the maximum allowed order for slabs.
A high setting may cause OOMs due to memory A high setting may cause OOMs due to memory
fragmentation. For more information see fragmentation. For more information see
Documentation/vm/slub.txt. Documentation/vm/slub.rst.
slub_min_objects= [MM, SLUB] slub_min_objects= [MM, SLUB]
The minimum number of objects per slab. SLUB will The minimum number of objects per slab. SLUB will
...@@ -3952,12 +3952,12 @@ ...@@ -3952,12 +3952,12 @@
the number of objects indicated. The higher the number the number of objects indicated. The higher the number
of objects the smaller the overhead of tracking slabs of objects the smaller the overhead of tracking slabs
and the less frequently locks need to be acquired. and the less frequently locks need to be acquired.
For more information see Documentation/vm/slub.txt. For more information see Documentation/vm/slub.rst.
slub_min_order= [MM, SLUB] slub_min_order= [MM, SLUB]
Determines the minimum page order for slabs. Must be Determines the minimum page order for slabs. Must be
lower than slub_max_order. lower than slub_max_order.
For more information see Documentation/vm/slub.txt. For more information see Documentation/vm/slub.rst.
slub_nomerge [MM, SLUB] slub_nomerge [MM, SLUB]
Same with slab_nomerge. This is supported for legacy. Same with slab_nomerge. This is supported for legacy.
...@@ -4313,7 +4313,7 @@ ...@@ -4313,7 +4313,7 @@
Format: [always|madvise|never] Format: [always|madvise|never]
Can be used to control the default behavior of the system Can be used to control the default behavior of the system
with respect to transparent hugepages. with respect to transparent hugepages.
See Documentation/vm/transhuge.txt for more details. See Documentation/vm/transhuge.rst for more details.
tsc= Disable clocksource stability checks for TSC. tsc= Disable clocksource stability checks for TSC.
Format: <string> Format: <string>
......
...@@ -120,7 +120,7 @@ A typical out of bounds access report looks like this:: ...@@ -120,7 +120,7 @@ A typical out of bounds access report looks like this::
The header of the report describes what kind of bug happened and what kind of The header of the report describes what kind of bug happened and what kind of
access caused it. It's followed by the description of the accessed slub object access caused it. It's followed by the description of the accessed slub object
(see 'SLUB Debug output' section in Documentation/vm/slub.txt for details) and (see 'SLUB Debug output' section in Documentation/vm/slub.rst for details) and
the description of the accessed memory page. the description of the accessed memory page.
In the last section the report shows memory state around the accessed address. In the last section the report shows memory state around the accessed address.
......
...@@ -515,7 +515,7 @@ guarantees: ...@@ -515,7 +515,7 @@ guarantees:
The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
bits on both physical and virtual pages associated with a process, and the bits on both physical and virtual pages associated with a process, and the
soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details). soft-dirty bit on pte (see Documentation/vm/soft-dirty.rst for details).
To clear the bits for all the pages associated with the process To clear the bits for all the pages associated with the process
> echo 1 > /proc/PID/clear_refs > echo 1 > /proc/PID/clear_refs
...@@ -536,7 +536,7 @@ Any other value written to /proc/PID/clear_refs will have no effect. ...@@ -536,7 +536,7 @@ Any other value written to /proc/PID/clear_refs will have no effect.
The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
using /proc/kpageflags and number of times a page is mapped using using /proc/kpageflags and number of times a page is mapped using
/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt. /proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.rst.
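As a rough illustration of how userspace consumes that interface, here is a
minimal, hedged sketch (not part of the original document) that looks up the
PFN backing one of the process's own pages; note that since Linux 4.0 the PFN
field reads back as zero without CAP_SYS_ADMIN::

    /* Hedged userspace sketch: look up the PFN backing one virtual address. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            long psize = sysconf(_SC_PAGESIZE);
            static char buf[4096];          /* the page we want to inspect */
            uint64_t entry;
            int fd = open("/proc/self/pagemap", O_RDONLY);

            if (fd < 0)
                    return 1;
            buf[0] = 1;                     /* touch the page so it is present */
            pread(fd, &entry, sizeof(entry),
                  ((uintptr_t)buf / psize) * sizeof(entry));
            if (entry & (1ULL << 63))       /* bit 63: page present */
                    printf("PFN: 0x%llx\n",
                           (unsigned long long)(entry & ((1ULL << 55) - 1)));
            close(fd);
            return 0;
    }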
The /proc/pid/numa_maps is an extension based on maps, showing the memory The /proc/pid/numa_maps is an extension based on maps, showing the memory
locality and binding policy, as well as the memory usage (in pages) of locality and binding policy, as well as the memory usage (in pages) of
......
...@@ -105,7 +105,7 @@ policy for the file will revert to "default" policy. ...@@ -105,7 +105,7 @@ policy for the file will revert to "default" policy.
NUMA memory allocation policies have optional flags that can be used in NUMA memory allocation policies have optional flags that can be used in
conjunction with their modes. These optional flags can be specified conjunction with their modes. These optional flags can be specified
when tmpfs is mounted by appending them to the mode before the NodeList. when tmpfs is mounted by appending them to the mode before the NodeList.
See Documentation/vm/numa_memory_policy.txt for a list of all available See Documentation/vm/numa_memory_policy.rst for a list of all available
memory allocation policy mode flags and their effect on memory policy. memory allocation policy mode flags and their effect on memory policy.
=static is equivalent to MPOL_F_STATIC_NODES =static is equivalent to MPOL_F_STATIC_NODES
......
...@@ -89,6 +89,7 @@ needed). ...@@ -89,6 +89,7 @@ needed).
sound/index sound/index
crypto/index crypto/index
filesystems/index filesystems/index
vm/index
Architecture-specific documentation Architecture-specific documentation
----------------------------------- -----------------------------------
......
...@@ -515,7 +515,7 @@ nr_hugepages ...@@ -515,7 +515,7 @@ nr_hugepages
Change the minimum size of the hugepage pool. Change the minimum size of the hugepage pool.
See Documentation/vm/hugetlbpage.txt See Documentation/vm/hugetlbpage.rst
============================================================== ==============================================================
...@@ -524,7 +524,7 @@ nr_overcommit_hugepages ...@@ -524,7 +524,7 @@ nr_overcommit_hugepages
Change the maximum size of the hugepage pool. The maximum is Change the maximum size of the hugepage pool. The maximum is
nr_hugepages + nr_overcommit_hugepages. nr_hugepages + nr_overcommit_hugepages.
See Documentation/vm/hugetlbpage.txt See Documentation/vm/hugetlbpage.rst
============================================================== ==============================================================
...@@ -667,7 +667,7 @@ and don't use much of it. ...@@ -667,7 +667,7 @@ and don't use much of it.
The default value is 0. The default value is 0.
See Documentation/vm/overcommit-accounting and See Documentation/vm/overcommit-accounting.rst and
mm/mmap.c::__vm_enough_memory() for more information. mm/mmap.c::__vm_enough_memory() for more information.
============================================================== ==============================================================
......
00-INDEX 00-INDEX
- this file. - this file.
active_mm.txt active_mm.rst
- An explanation from Linus about tsk->active_mm vs tsk->mm. - An explanation from Linus about tsk->active_mm vs tsk->mm.
balance balance.rst
- various information on memory balancing. - various information on memory balancing.
cleancache.txt cleancache.rst
- Intro to cleancache and page-granularity victim cache. - Intro to cleancache and page-granularity victim cache.
frontswap.txt frontswap.rst
- Outline frontswap, part of the transcendent memory frontend. - Outline frontswap, part of the transcendent memory frontend.
highmem.txt highmem.rst
- Outline of highmem and common issues. - Outline of highmem and common issues.
hmm.txt hmm.rst
- Documentation of heterogeneous memory management - Documentation of heterogeneous memory management
hugetlbpage.txt hugetlbpage.rst
- a brief summary of hugetlbpage support in the Linux kernel. - a brief summary of hugetlbpage support in the Linux kernel.
hugetlbfs_reserv.txt hugetlbfs_reserv.rst
- A brief overview of hugetlbfs reservation design/implementation. - A brief overview of hugetlbfs reservation design/implementation.
hwpoison.txt hwpoison.rst
- explains what hwpoison is - explains what hwpoison is
idle_page_tracking.txt idle_page_tracking.rst
- description of the idle page tracking feature. - description of the idle page tracking feature.
ksm.txt ksm.rst
- how to use the Kernel Samepage Merging feature. - how to use the Kernel Samepage Merging feature.
mmu_notifier.txt mmu_notifier.rst
- a note about clearing pte/pmd and mmu notifications - a note about clearing pte/pmd and mmu notifications
numa numa.rst
- information about NUMA specific code in the Linux vm. - information about NUMA specific code in the Linux vm.
numa_memory_policy.txt numa_memory_policy.rst
- documentation of concepts and APIs of the 2.6 memory policy support. - documentation of concepts and APIs of the 2.6 memory policy support.
overcommit-accounting overcommit-accounting.rst
- description of the Linux kernels overcommit handling modes. - description of the Linux kernels overcommit handling modes.
page_frags page_frags.rst
- description of page fragments allocator - description of page fragments allocator
page_migration page_migration.rst
- description of page migration in NUMA systems. - description of page migration in NUMA systems.
pagemap.txt pagemap.rst
- pagemap, from the userspace perspective - pagemap, from the userspace perspective
page_owner.txt page_owner.rst
- tracking about who allocated each page - tracking about who allocated each page
remap_file_pages.txt remap_file_pages.rst
- a note about remap_file_pages() system call - a note about remap_file_pages() system call
slub.txt slub.rst
- a short users guide for SLUB. - a short users guide for SLUB.
soft-dirty.txt soft-dirty.rst
- short explanation for soft-dirty PTEs - short explanation for soft-dirty PTEs
split_page_table_lock split_page_table_lock.rst
- Separate per-table lock to improve scalability of the old page_table_lock. - Separate per-table lock to improve scalability of the old page_table_lock.
swap_numa.txt swap_numa.rst
- automatic binding of swap device to numa node - automatic binding of swap device to numa node
transhuge.txt transhuge.rst
- Transparent Hugepage Support, alternative way of using hugepages. - Transparent Hugepage Support, alternative way of using hugepages.
unevictable-lru.txt unevictable-lru.rst
- Unevictable LRU infrastructure - Unevictable LRU infrastructure
userfaultfd.txt userfaultfd.rst
- description of userfaultfd system call - description of userfaultfd system call
z3fold.txt z3fold.txt
- outline of z3fold allocator for storing compressed pages - outline of z3fold allocator for storing compressed pages
zsmalloc.txt zsmalloc.rst
- outline of zsmalloc allocator for storing compressed pages - outline of zsmalloc allocator for storing compressed pages
zswap.txt zswap.rst
- Intro to compressed cache for swap pages - Intro to compressed cache for swap pages
.. _active_mm:
=========
Active MM
=========
::
List: linux-kernel
Subject: Re: active_mm
From: Linus Torvalds <torvalds () transmeta ! com>
Date: 1999-07-30 21:36:24
Cc'd to linux-kernel, because I don't write explanations all that often,
and when I do I feel better about more people reading them.
On Fri, 30 Jul 1999, David Mosberger wrote:
>
> Is there a brief description someplace on how "mm" vs. "active_mm" in
> the task_struct are supposed to be used? (My apologies if this was
> discussed on the mailing lists---I just returned from vacation and
> wasn't able to follow linux-kernel for a while).
Basically, the new setup is:
- we have "real address spaces" and "anonymous address spaces". The
difference is that an anonymous address space doesn't care about the
user-level page tables at all, so when we do a context switch into an
anonymous address space we just leave the previous address space
active.
The obvious use for a "anonymous address space" is any thread that
doesn't need any user mappings - all kernel threads basically fall into
this category, but even "real" threads can temporarily say that for
some amount of time they are not going to be interested in user space,
and that the scheduler might as well try to avoid wasting time on
switching the VM state around. Currently only the old-style bdflush
sync does that.
- "tsk->mm" points to the "real address space". For an anonymous process,
tsk->mm will be NULL, for the logical reason that an anonymous process
really doesn't _have_ a real address space at all.
- however, we obviously need to keep track of which address space we
"stole" for such an anonymous user. For that, we have "tsk->active_mm",
which shows what the currently active address space is.
The rule is that for a process with a real address space (ie tsk->mm is
non-NULL) the active_mm obviously always has to be the same as the real
one.
For a anonymous process, tsk->mm == NULL, and tsk->active_mm is the
"borrowed" mm while the anonymous process is running. When the
anonymous process gets scheduled away, the borrowed address space is
returned and cleared.
To support all that, the "struct mm_struct" now has two counters: a
"mm_users" counter that is how many "real address space users" there are,
and a "mm_count" counter that is the number of "lazy" users (ie anonymous
users) plus one if there are any real users.
Usually there is at least one real user, but it could be that the real
user exited on another CPU while a lazy user was still active, so you do
actually get cases where you have a address space that is _only_ used by
lazy users. That is often a short-lived state, because once that thread
gets scheduled away in favour of a real thread, the "zombie" mm gets
released because "mm_users" becomes zero.
Also, a new rule is that _nobody_ ever has "init_mm" as a real MM any
more. "init_mm" should be considered just a "lazy context when no other
context is available", and in fact it is mainly used just at bootup when
no real VM has yet been created. So code that used to check
if (current->mm == &init_mm)
should generally just do
if (!current->mm)
instead (which makes more sense anyway - the test is basically one of "do
we have a user context", and is generally done by the page fault handler
and things like that).
Anyway, I put a pre-patch-2.3.13-1 on ftp.kernel.org just a moment ago,
because it slightly changes the interfaces to accommodate the alpha (who
would have thought it, but the alpha actually ends up having one of the
ugliest context switch codes - unlike the other architectures where the MM
and register state is separate, the alpha PALcode joins the two, and you
need to switch both together).
(From http://marc.info/?l=linux-kernel&m=93337278602211&w=2)
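The counter semantics described above can be illustrated with a simplified,
hedged sketch of what the scheduler does at context-switch time (the real code
lives in kernel/sched/core.c, also calls enter_lazy_tlb() and defers the
mmdrop(); this only shows the shape of it)::

    #include <linux/sched.h>
    #include <linux/sched/mm.h>
    #include <asm/mmu_context.h>

    /*
     * Simplified sketch: mm_users counts real users of the address space,
     * mm_count counts lazy (anonymous) users plus one for all real users.
     */
    static void sketch_context_switch(struct task_struct *prev,
                                      struct task_struct *next)
    {
            struct mm_struct *oldmm = prev->active_mm;

            if (!next->mm) {                /* kernel thread: borrow oldmm */
                    next->active_mm = oldmm;
                    mmgrab(oldmm);          /* ++mm_count */
            } else {
                    switch_mm(oldmm, next->mm, next);
            }

            if (!prev->mm) {                /* prev was lazy: hand it back */
                    prev->active_mm = NULL;
                    mmdrop(oldmm);          /* --mm_count, freed when it hits 0 */
            }
    }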
.. _balance:
================
Memory Balancing
================
Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com> Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com>
Memory balancing is needed for !__GFP_ATOMIC and !__GFP_KSWAPD_RECLAIM as Memory balancing is needed for !__GFP_ATOMIC and !__GFP_KSWAPD_RECLAIM as
...@@ -89,7 +95,8 @@ pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set. ...@@ -89,7 +95,8 @@ pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set.
(Good) Ideas that I have heard: (Good) Ideas that I have heard:
1. Dynamic experience should influence balancing: number of failed requests 1. Dynamic experience should influence balancing: number of failed requests
for a zone can be tracked and fed into the balancing scheme (jalvo@mbay.net) for a zone can be tracked and fed into the balancing scheme (jalvo@mbay.net)
2. Implement a replace_with_highmem()-like replace_with_regular() to preserve 2. Implement a replace_with_highmem()-like replace_with_regular() to preserve
dma pages. (lkd@tantalophile.demon.co.uk) dma pages. (lkd@tantalophile.demon.co.uk)
MOTIVATION .. _cleancache:
==========
Cleancache
==========
Motivation
==========
Cleancache is a new optional feature provided by the VFS layer that Cleancache is a new optional feature provided by the VFS layer that
potentially dramatically increases page cache effectiveness for potentially dramatically increases page cache effectiveness for
...@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented ...@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented
in Xen (using hypervisor memory) and zcache (using in-kernel compressed in Xen (using hypervisor memory) and zcache (using in-kernel compressed
memory) and other implementations are in development. memory) and other implementations are in development.
FAQs are included below. :ref:`FAQs <faq>` are included below.
IMPLEMENTATION OVERVIEW Implementation Overview
=======================
A cleancache "backend" that provides transcendent memory registers itself A cleancache "backend" that provides transcendent memory registers itself
to the kernel's cleancache "frontend" by calling cleancache_register_ops, to the kernel's cleancache "frontend" by calling cleancache_register_ops,
...@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page ...@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page
with the same handle, the results are indeterminate. Callers must with the same handle, the results are indeterminate. Callers must
lock the page to ensure serial behavior. lock the page to ensure serial behavior.
CLEANCACHE PERFORMANCE METRICS Cleancache Performance Metrics
==============================
If properly configured, monitoring of cleancache is done via debugfs in If properly configured, monitoring of cleancache is done via debugfs in
the /sys/kernel/debug/cleancache directory. The effectiveness of cleancache the `/sys/kernel/debug/cleancache` directory. The effectiveness of cleancache
can be measured (across all filesystems) with: can be measured (across all filesystems) with:
succ_gets - number of gets that were successful ``succ_gets``
failed_gets - number of gets that failed number of gets that were successful
puts - number of puts attempted (all "succeed")
invalidates - number of invalidates attempted ``failed_gets``
number of gets that failed
``puts``
number of puts attempted (all "succeed")
``invalidates``
number of invalidates attempted
A backend implementation may provide additional metrics. A backend implementation may provide additional metrics.
.. _faq:
FAQ FAQ
===
1) Where's the value? (Andrew Morton) * Where's the value? (Andrew Morton)
Cleancache provides a significant performance benefit to many workloads Cleancache provides a significant performance benefit to many workloads
in many environments with negligible overhead by improving the in many environments with negligible overhead by improving the
...@@ -137,7 +156,7 @@ device that stores pages of data in a compressed state. And ...@@ -137,7 +156,7 @@ device that stores pages of data in a compressed state. And
the proposed "RAMster" driver shares RAM across multiple physical the proposed "RAMster" driver shares RAM across multiple physical
systems. systems.
2) Why does cleancache have its sticky fingers so deep inside the * Why does cleancache have its sticky fingers so deep inside the
filesystems and VFS? (Andrew Morton and Christoph Hellwig) filesystems and VFS? (Andrew Morton and Christoph Hellwig)
The core hooks for cleancache in VFS are in most cases a single line The core hooks for cleancache in VFS are in most cases a single line
...@@ -168,9 +187,9 @@ filesystems in the future. ...@@ -168,9 +187,9 @@ filesystems in the future.
The total impact of the hooks to existing fs and mm files is only The total impact of the hooks to existing fs and mm files is only
about 40 lines added (not counting comments and blank lines). about 40 lines added (not counting comments and blank lines).
3) Why not make cleancache asynchronous and batched so it can * Why not make cleancache asynchronous and batched so it can more
more easily interface with real devices with DMA instead easily interface with real devices with DMA instead of copying each
of copying each individual page? (Minchan Kim) individual page? (Minchan Kim)
The one-page-at-a-time copy semantics simplifies the implementation The one-page-at-a-time copy semantics simplifies the implementation
on both the frontend and backend and also allows the backend to on both the frontend and backend and also allows the backend to
...@@ -182,7 +201,7 @@ are avoided. While the interface seems odd for a "real device" ...@@ -182,7 +201,7 @@ are avoided. While the interface seems odd for a "real device"
or for real kernel-addressable RAM, it makes perfect sense for or for real kernel-addressable RAM, it makes perfect sense for
transcendent memory. transcendent memory.
4) Why is non-shared cleancache "exclusive"? And where is the * Why is non-shared cleancache "exclusive"? And where is the
page "invalidated" after a "get"? (Minchan Kim) page "invalidated" after a "get"? (Minchan Kim)
The main reason is to free up space in transcendent memory and The main reason is to free up space in transcendent memory and
...@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call. ...@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call.
The invalidate is done by the cleancache backend implementation. The invalidate is done by the cleancache backend implementation.
5) What's the performance impact? * What's the performance impact?
Performance analysis has been presented at OLS'09 and LCA'10. Performance analysis has been presented at OLS'09 and LCA'10.
Briefly, performance gains can be significant on most workloads, Briefly, performance gains can be significant on most workloads,
...@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache ...@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache
has little value, but in newer multicore machines, especially has little value, but in newer multicore machines, especially
consolidated/virtualized machines, it has great value. consolidated/virtualized machines, it has great value.
6) How do I add cleancache support for filesystem X? (Boaz Harrash) * How do I add cleancache support for filesystem X? (Boaz Harrash)
Filesystems that are well-behaved and conform to certain Filesystems that are well-behaved and conform to certain
restrictions can utilize cleancache simply by making a call to restrictions can utilize cleancache simply by making a call to
...@@ -217,26 +236,26 @@ not enable the optional cleancache. ...@@ -217,26 +236,26 @@ not enable the optional cleancache.
Some points for a filesystem to consider: Some points for a filesystem to consider:
- The FS should be block-device-based (e.g. a ram-based FS such - The FS should be block-device-based (e.g. a ram-based FS such
as tmpfs should not enable cleancache) as tmpfs should not enable cleancache)
- To ensure coherency/correctness, the FS must ensure that all - To ensure coherency/correctness, the FS must ensure that all
file removal or truncation operations either go through VFS or file removal or truncation operations either go through VFS or
add hooks to do the equivalent cleancache "invalidate" operations add hooks to do the equivalent cleancache "invalidate" operations
- To ensure coherency/correctness, either inode numbers must - To ensure coherency/correctness, either inode numbers must
be unique across the lifetime of the on-disk file OR the be unique across the lifetime of the on-disk file OR the
FS must provide an "encode_fh" function. FS must provide an "encode_fh" function.
- The FS must call the VFS superblock alloc and deactivate routines - The FS must call the VFS superblock alloc and deactivate routines
or add hooks to do the equivalent cleancache calls done there. or add hooks to do the equivalent cleancache calls done there.
- To maximize performance, all pages fetched from the FS should - To maximize performance, all pages fetched from the FS should
go through the do_mpage_readpage routine or the FS should add go through the do_mpage_readpage routine or the FS should add
hooks to do the equivalent (cf. btrfs) hooks to do the equivalent (cf. btrfs)
- Currently, the FS blocksize must be the same as PAGESIZE. This - Currently, the FS blocksize must be the same as PAGESIZE. This
is not an architectural restriction, but no backends currently is not an architectural restriction, but no backends currently
support anything different. support anything different.
- A clustered FS should invoke the "shared_init_fs" cleancache - A clustered FS should invoke the "shared_init_fs" cleancache
hook to get best performance for some backends. hook to get best performance for some backends.
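Taken together, the hook a well-behaved filesystem needs is tiny; a hedged
sketch follows (the mount-path function name is hypothetical, the call itself
is the real cleancache entry point)::

    #include <linux/fs.h>
    #include <linux/cleancache.h>

    static int myfs_fill_super(struct super_block *sb, void *data, int silent)
    {
            /* ... normal superblock setup ... */

            cleancache_init_fs(sb);   /* opt this superblock in to cleancache */
            return 0;
    }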
7) Why not use the KVA of the inode as the key? (Christoph Hellwig) * Why not use the KVA of the inode as the key? (Christoph Hellwig)
If cleancache would use the inode virtual address instead of If cleancache would use the inode virtual address instead of
inode/filehandle, the pool id could be eliminated. But, this inode/filehandle, the pool id could be eliminated. But, this
...@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache ...@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache
is potentially much larger than the kernel pagecache and is most is potentially much larger than the kernel pagecache and is most
useful if the pages survive inode cache removal. useful if the pages survive inode cache removal.
8) Why is a global variable required? * Why is a global variable required?
The cleancache_enabled flag is checked in all of the frequently-used The cleancache_enabled flag is checked in all of the frequently-used
cleancache hooks. The alternative is a function call to check a static cleancache hooks. The alternative is a function call to check a static
...@@ -262,13 +281,13 @@ global variable allows cleancache to be enabled by default at compile ...@@ -262,13 +281,13 @@ global variable allows cleancache to be enabled by default at compile
time, but have insignificant performance impact when cleancache remains time, but have insignificant performance impact when cleancache remains
disabled at runtime. disabled at runtime.
9) Does cleancache work with KVM? * Does cleancache work with KVM?
The memory model of KVM is sufficiently different that a cleancache The memory model of KVM is sufficiently different that a cleancache
backend may have less value for KVM. This remains to be tested, backend may have less value for KVM. This remains to be tested,
especially in an overcommitted system. especially in an overcommitted system.
10) Does cleancache work in userspace? It sounds useful for * Does cleancache work in userspace? It sounds useful for
memory hungry caches like web browsers. (Jamie Lokier) memory hungry caches like web browsers. (Jamie Lokier)
No plans yet, though we agree it sounds useful, at least for No plans yet, though we agree it sounds useful, at least for
......
# -*- coding: utf-8; mode: python -*-
project = "Linux Memory Management Documentation"
tags.add("subproject")
latex_documents = [
('index', 'memory-management.tex', project,
'The kernel development community', 'manual'),
]
.. _frontswap:
=========
Frontswap
=========
Frontswap provides a "transcendent memory" interface for swap pages. Frontswap provides a "transcendent memory" interface for swap pages.
In some environments, dramatic performance savings may be obtained because In some environments, dramatic performance savings may be obtained because
swapped pages are saved in RAM (or a RAM-like device) instead of a swap disk. swapped pages are saved in RAM (or a RAM-like device) instead of a swap disk.
(Note, frontswap -- and cleancache (merged at 3.0) -- are the "frontends" (Note, frontswap -- and :ref:`cleancache` (merged at 3.0) -- are the "frontends"
and the only necessary changes to the core kernel for transcendent memory; and the only necessary changes to the core kernel for transcendent memory;
all other supporting code -- the "backends" -- is implemented as drivers. all other supporting code -- the "backends" -- is implemented as drivers.
See the LWN.net article "Transcendent memory in a nutshell" for a detailed See the LWN.net article `Transcendent memory in a nutshell`_
overview of frontswap and related kernel parts: for a detailed overview of frontswap and related kernel parts)
https://lwn.net/Articles/454795/ )
.. _Transcendent memory in a nutshell: https://lwn.net/Articles/454795/
Frontswap is so named because it can be thought of as the opposite of Frontswap is so named because it can be thought of as the opposite of
a "backing" store for a swap device. The storage is assumed to be a "backing" store for a swap device. The storage is assumed to be
...@@ -50,19 +57,27 @@ or the store fails AND the page is invalidated. This ensures stale data may ...@@ -50,19 +57,27 @@ or the store fails AND the page is invalidated. This ensures stale data may
never be obtained from frontswap. never be obtained from frontswap.
If properly configured, monitoring of frontswap is done via debugfs in If properly configured, monitoring of frontswap is done via debugfs in
the /sys/kernel/debug/frontswap directory. The effectiveness of the `/sys/kernel/debug/frontswap` directory. The effectiveness of
frontswap can be measured (across all swap devices) with: frontswap can be measured (across all swap devices) with:
failed_stores - how many store attempts have failed ``failed_stores``
loads - how many loads were attempted (all should succeed) how many store attempts have failed
succ_stores - how many store attempts have succeeded
invalidates - how many invalidates were attempted ``loads``
how many loads were attempted (all should succeed)
``succ_stores``
how many store attempts have succeeded
``invalidates``
how many invalidates were attempted
A backend implementation may provide additional metrics. A backend implementation may provide additional metrics.
FAQ FAQ
===
1) Where's the value? * Where's the value?
When a workload starts swapping, performance falls through the floor. When a workload starts swapping, performance falls through the floor.
Frontswap significantly increases performance in many such workloads by Frontswap significantly increases performance in many such workloads by
...@@ -117,7 +132,7 @@ A KVM implementation is underway and has been RFC'ed to lkml. And, ...@@ -117,7 +132,7 @@ A KVM implementation is underway and has been RFC'ed to lkml. And,
using frontswap, investigation is also underway on the use of NVM as using frontswap, investigation is also underway on the use of NVM as
a memory extension technology. a memory extension technology.
2) Sure there may be performance advantages in some situations, but * Sure there may be performance advantages in some situations, but
what's the space/time overhead of frontswap? what's the space/time overhead of frontswap?
If CONFIG_FRONTSWAP is disabled, every frontswap hook compiles into If CONFIG_FRONTSWAP is disabled, every frontswap hook compiles into
...@@ -148,7 +163,7 @@ pressure that can potentially outweigh the other advantages. A ...@@ -148,7 +163,7 @@ pressure that can potentially outweigh the other advantages. A
backend, such as zcache, must implement policies to carefully (but backend, such as zcache, must implement policies to carefully (but
dynamically) manage memory limits to ensure this doesn't happen. dynamically) manage memory limits to ensure this doesn't happen.
3) OK, how about a quick overview of what this frontswap patch does * OK, how about a quick overview of what this frontswap patch does
in terms that a kernel hacker can grok? in terms that a kernel hacker can grok?
Let's assume that a frontswap "backend" has registered during Let's assume that a frontswap "backend" has registered during
...@@ -188,7 +203,7 @@ and (potentially) a swap device write are replaced by a "frontswap backend ...@@ -188,7 +203,7 @@ and (potentially) a swap device write are replaced by a "frontswap backend
store" and (possibly) a "frontswap backend loads", which are presumably much store" and (possibly) a "frontswap backend loads", which are presumably much
faster. faster.
4) Can't frontswap be configured as a "special" swap device that is * Can't frontswap be configured as a "special" swap device that is
just higher priority than any real swap device (e.g. like zswap, just higher priority than any real swap device (e.g. like zswap,
or maybe swap-over-nbd/NFS)? or maybe swap-over-nbd/NFS)?
...@@ -240,7 +255,7 @@ installation, frontswap is useless. Swapless portable devices ...@@ -240,7 +255,7 @@ installation, frontswap is useless. Swapless portable devices
can still use frontswap but a backend for such devices must configure can still use frontswap but a backend for such devices must configure
some kind of "ghost" swap device and ensure that it is never used. some kind of "ghost" swap device and ensure that it is never used.
5) Why this weird definition about "duplicate stores"? If a page * Why this weird definition about "duplicate stores"? If a page
has been previously successfully stored, can't it always be has been previously successfully stored, can't it always be
successfully overwritten? successfully overwritten?
...@@ -254,7 +269,7 @@ the old data and ensure that it is no longer accessible. Since the ...@@ -254,7 +269,7 @@ the old data and ensure that it is no longer accessible. Since the
swap subsystem then writes the new data to the read swap device, swap subsystem then writes the new data to the read swap device,
this is the correct course of action to ensure coherency. this is the correct course of action to ensure coherency.
6) What is frontswap_shrink for? * What is frontswap_shrink for?
When the (non-frontswap) swap subsystem swaps out a page to a real When the (non-frontswap) swap subsystem swaps out a page to a real
swap device, that page is only taking up low-value pre-allocated disk swap device, that page is only taking up low-value pre-allocated disk
...@@ -267,7 +282,7 @@ to "repatriate" pages sent to a remote machine back to the local machine; ...@@ -267,7 +282,7 @@ to "repatriate" pages sent to a remote machine back to the local machine;
this is driven using the frontswap_shrink mechanism when memory pressure this is driven using the frontswap_shrink mechanism when memory pressure
subsides. subsides.
7) Why does the frontswap patch create the new include file swapfile.h? * Why does the frontswap patch create the new include file swapfile.h?
The frontswap code depends on some swap-subsystem-internal data The frontswap code depends on some swap-subsystem-internal data
structures that have, over the years, moved back and forth between structures that have, over the years, moved back and forth between
......
.. _highmem:
==================== ====================
HIGH MEMORY HANDLING High Memory Handling
==================== ====================
By: Peter Zijlstra <a.p.zijlstra@chello.nl> By: Peter Zijlstra <a.p.zijlstra@chello.nl>
Contents: .. contents:: :local:
(*) What is high memory?
(*) Temporary virtual mappings.
(*) Using kmap_atomic.
(*) Cost of temporary mappings.
(*) i386 PAE.
What Is High Memory?
====================
WHAT IS HIGH MEMORY?
==================== ====================
High memory (highmem) is used when the size of physical memory approaches or High memory (highmem) is used when the size of physical memory approaches or
...@@ -38,7 +27,7 @@ kernel entry/exit. This means the available virtual memory space (4GiB on ...@@ -38,7 +27,7 @@ kernel entry/exit. This means the available virtual memory space (4GiB on
i386) has to be divided between user and kernel space. i386) has to be divided between user and kernel space.
The traditional split for architectures using this approach is 3:1, 3GiB for The traditional split for architectures using this approach is 3:1, 3GiB for
userspace and the top 1GiB for kernel space: userspace and the top 1GiB for kernel space::
+--------+ 0xffffffff +--------+ 0xffffffff
| Kernel | | Kernel |
...@@ -58,22 +47,21 @@ and user maps. Some hardware (like some ARMs), however, have limited virtual ...@@ -58,22 +47,21 @@ and user maps. Some hardware (like some ARMs), however, have limited virtual
space when they use mm context tags. space when they use mm context tags.
========================== Temporary Virtual Mappings
TEMPORARY VIRTUAL MAPPINGS
========================== ==========================
The kernel contains several ways of creating temporary mappings: The kernel contains several ways of creating temporary mappings:
(*) vmap(). This can be used to make a long duration mapping of multiple * vmap(). This can be used to make a long duration mapping of multiple
physical pages into a contiguous virtual space. It needs global physical pages into a contiguous virtual space. It needs global
synchronization to unmap. synchronization to unmap.
(*) kmap(). This permits a short duration mapping of a single page. It needs * kmap(). This permits a short duration mapping of a single page. It needs
global synchronization, but is amortized somewhat. It is also prone to global synchronization, but is amortized somewhat. It is also prone to
deadlocks when using in a nested fashion, and so it is not recommended for deadlocks when using in a nested fashion, and so it is not recommended for
new code. new code.
(*) kmap_atomic(). This permits a very short duration mapping of a single * kmap_atomic(). This permits a very short duration mapping of a single
page. Since the mapping is restricted to the CPU that issued it, it page. Since the mapping is restricted to the CPU that issued it, it
performs well, but the issuing task is therefore required to stay on that performs well, but the issuing task is therefore required to stay on that
CPU until it has finished, lest some other task displace its mappings. CPU until it has finished, lest some other task displace its mappings.
...@@ -84,14 +72,13 @@ The kernel contains several ways of creating temporary mappings: ...@@ -84,14 +72,13 @@ The kernel contains several ways of creating temporary mappings:
It may be assumed that k[un]map_atomic() won't fail. It may be assumed that k[un]map_atomic() won't fail.
================= Using kmap_atomic
USING KMAP_ATOMIC
================= =================
When and where to use kmap_atomic() is straightforward. It is used when code When and where to use kmap_atomic() is straightforward. It is used when code
wants to access the contents of a page that might be allocated from high memory wants to access the contents of a page that might be allocated from high memory
(see __GFP_HIGHMEM), for example a page in the pagecache. The API has two (see __GFP_HIGHMEM), for example a page in the pagecache. The API has two
functions, and they can be used in a manner similar to the following: functions, and they can be used in a manner similar to the following::
/* Find the page of interest. */ /* Find the page of interest. */
struct page *page = find_get_page(mapping, offset); struct page *page = find_get_page(mapping, offset);
...@@ -109,7 +96,7 @@ Note that the kunmap_atomic() call takes the result of the kmap_atomic() call ...@@ -109,7 +96,7 @@ Note that the kunmap_atomic() call takes the result of the kmap_atomic() call
not the argument. not the argument.
If you need to map two pages because you want to copy from one page to If you need to map two pages because you want to copy from one page to
another you need to keep the kmap_atomic calls strictly nested, like: another you need to keep the kmap_atomic calls strictly nested, like::
vaddr1 = kmap_atomic(page1); vaddr1 = kmap_atomic(page1);
vaddr2 = kmap_atomic(page2); vaddr2 = kmap_atomic(page2);
...@@ -120,8 +107,7 @@ another you need to keep the kmap_atomic calls strictly nested, like: ...@@ -120,8 +107,7 @@ another you need to keep the kmap_atomic calls strictly nested, like:
kunmap_atomic(vaddr1); kunmap_atomic(vaddr1);
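Putting the two rules together, a hedged, self-contained sketch of copying
between two pages that may both live in high memory (the kernel already
provides copy_highpage() for exactly this case)::

    #include <linux/highmem.h>
    #include <linux/string.h>

    static void my_copy_highmem_page(struct page *dst, struct page *src)
    {
            void *vsrc = kmap_atomic(src);
            void *vdst = kmap_atomic(dst);

            memcpy(vdst, vsrc, PAGE_SIZE);
            kunmap_atomic(vdst);    /* mapped last, unmapped first */
            kunmap_atomic(vsrc);
    }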
========================== Cost of Temporary Mappings
COST OF TEMPORARY MAPPINGS
========================== ==========================
The cost of creating temporary mappings can be quite high. The arch has to The cost of creating temporary mappings can be quite high. The arch has to
...@@ -136,22 +122,21 @@ If CONFIG_MMU is not set, then there can be no temporary mappings and no ...@@ -136,22 +122,21 @@ If CONFIG_MMU is not set, then there can be no temporary mappings and no
highmem. In such a case, the arithmetic approach will also be used. highmem. In such a case, the arithmetic approach will also be used.
========
i386 PAE i386 PAE
======== ========
The i386 arch, under some circumstances, will permit you to stick up to 64GiB The i386 arch, under some circumstances, will permit you to stick up to 64GiB
of RAM into your 32-bit machine. This has a number of consequences: of RAM into your 32-bit machine. This has a number of consequences:
(*) Linux needs a page-frame structure for each page in the system and the * Linux needs a page-frame structure for each page in the system and the
pageframes need to live in the permanent mapping, which means: pageframes need to live in the permanent mapping, which means:
(*) you can have 896M/sizeof(struct page) page-frames at most; with struct * you can have 896M/sizeof(struct page) page-frames at most; with struct
page being 32-bytes that would end up being something in the order of 112G page being 32-bytes that would end up being something in the order of 112G
worth of pages; the kernel, however, needs to store more than just worth of pages; the kernel, however, needs to store more than just
page-frames in that memory... page-frames in that memory...
(*) PAE makes your page tables larger - which slows the system down as more * PAE makes your page tables larger - which slows the system down as more
data has to be accessed to traverse in TLB fills and the like. One data has to be accessed to traverse in TLB fills and the like. One
advantage is that PAE has more PTE bits and can provide advanced features advantage is that PAE has more PTE bits and can provide advanced features
like NX and PAT. like NX and PAT.
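The arithmetic behind the first two points above, as a rough estimate assuming
4 KiB pages::

    896 MiB of permanent mapping / 32 bytes per struct page  ~  28 million page frames
    28 million page frames * 4 KiB per page                  ~  112 GiB of describable RAM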
......
.. hmm:
=====================================
Heterogeneous Memory Management (HMM) Heterogeneous Memory Management (HMM)
=====================================
Provide infrastructure and helpers to integrate non-conventional memory (device Provide infrastructure and helpers to integrate non-conventional memory (device
memory like GPU on board memory) into regular kernel path, with the cornerstone memory like GPU on board memory) into regular kernel path, with the cornerstone
...@@ -6,10 +10,10 @@ of this being specialized struct page for such memory (see sections 5 to 7 of ...@@ -6,10 +10,10 @@ of this being specialized struct page for such memory (see sections 5 to 7 of
this document). this document).
HMM also provides optional helpers for SVM (Share Virtual Memory), i.e., HMM also provides optional helpers for SVM (Share Virtual Memory), i.e.,
allowing a device to transparently access program address coherently with the allowing a device to transparently access program address coherently with
CPU meaning that any valid pointer on the CPU is also a valid pointer for the the CPU meaning that any valid pointer on the CPU is also a valid pointer
device. This is becoming mandatory to simplify the use of advanced hetero- for the device. This is becoming mandatory to simplify the use of advanced
geneous computing where GPU, DSP, or FPGA are used to perform various heterogeneous computing where GPU, DSP, or FPGA are used to perform various
computations on behalf of a process. computations on behalf of a process.
This document is divided as follows: in the first section I expose the problems This document is divided as follows: in the first section I expose the problems
...@@ -21,19 +25,10 @@ fifth section deals with how device memory is represented inside the kernel. ...@@ -21,19 +25,10 @@ fifth section deals with how device memory is represented inside the kernel.
Finally, the last section presents a new migration helper that allows lever- Finally, the last section presents a new migration helper that allows lever-
aging the device DMA engine. aging the device DMA engine.
.. contents:: :local:
1) Problems of using a device specific memory allocator: Problems of using a device specific memory allocator
2) I/O bus, device memory characteristics ====================================================
3) Shared address space and migration
4) Address space mirroring implementation and API
5) Represent and manage device memory from core kernel point of view
6) Migration to and from device memory
7) Memory cgroup (memcg) and rss accounting
-------------------------------------------------------------------------------
1) Problems of using a device specific memory allocator:
Devices with a large amount of on board memory (several gigabytes) like GPUs Devices with a large amount of on board memory (several gigabytes) like GPUs
have historically managed their memory through dedicated driver specific APIs. have historically managed their memory through dedicated driver specific APIs.
...@@ -77,9 +72,8 @@ are only do-able with a shared address space. It is also more reasonable to use ...@@ -77,9 +72,8 @@ are only do-able with a shared address space. It is also more reasonable to use
a shared address space for all other patterns. a shared address space for all other patterns.
------------------------------------------------------------------------------- I/O bus, device memory characteristics
======================================
2) I/O bus, device memory characteristics
I/O buses cripple shared address spaces due to a few limitations. Most I/O I/O buses cripple shared address spaces due to a few limitations. Most I/O
buses only allow basic memory access from device to main memory; even cache buses only allow basic memory access from device to main memory; even cache
...@@ -109,9 +103,8 @@ access any memory but we must also permit any memory to be migrated to device ...@@ -109,9 +103,8 @@ access any memory but we must also permit any memory to be migrated to device
memory while device is using it (blocking CPU access while it happens). memory while device is using it (blocking CPU access while it happens).
------------------------------------------------------------------------------- Shared address space and migration
==================================
3) Shared address space and migration
HMM intends to provide two main features. First one is to share the address HMM intends to provide two main features. First one is to share the address
space by duplicating the CPU page table in the device page table so the same space by duplicating the CPU page table in the device page table so the same
...@@ -148,23 +141,23 @@ ages device memory by migrating the part of the data set that is actively being ...@@ -148,23 +141,23 @@ ages device memory by migrating the part of the data set that is actively being
used by the device. used by the device.
------------------------------------------------------------------------------- Address space mirroring implementation and API
==============================================
4) Address space mirroring implementation and API
Address space mirroring's main objective is to allow duplication of a range of Address space mirroring's main objective is to allow duplication of a range of
CPU page table into a device page table; HMM helps keep both synchronized. A CPU page table into a device page table; HMM helps keep both synchronized. A
device driver that wants to mirror a process address space must start with the device driver that wants to mirror a process address space must start with the
registration of an hmm_mirror struct: registration of an hmm_mirror struct::
int hmm_mirror_register(struct hmm_mirror *mirror, int hmm_mirror_register(struct hmm_mirror *mirror,
struct mm_struct *mm); struct mm_struct *mm);
int hmm_mirror_register_locked(struct hmm_mirror *mirror, int hmm_mirror_register_locked(struct hmm_mirror *mirror,
struct mm_struct *mm); struct mm_struct *mm);
The locked variant is to be used when the driver is already holding mmap_sem The locked variant is to be used when the driver is already holding mmap_sem
of the mm in write mode. The mirror struct has a set of callbacks that are used of the mm in write mode. The mirror struct has a set of callbacks that are used
to propagate CPU page tables: to propagate CPU page tables::
struct hmm_mirror_ops { struct hmm_mirror_ops {
/* sync_cpu_device_pagetables() - synchronize page tables /* sync_cpu_device_pagetables() - synchronize page tables
...@@ -193,9 +186,9 @@ The device driver must perform the update action to the range (mark range ...@@ -193,9 +186,9 @@ The device driver must perform the update action to the range (mark range
read only, or fully unmap, ...). The device must be done with the update before read only, or fully unmap, ...). The device must be done with the update before
the driver callback returns. the driver callback returns.
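For illustration, a device driver typically embeds the hmm_mirror in a driver
private structure and registers it once per mirrored process. A minimal,
hypothetical sketch (the ``my_`` names and the ops instance are not part of the
HMM API; the hmm_mirror_ops callbacks elided above must be provided)::

	struct my_mirror {
		struct hmm_mirror mirror;	/* embedded HMM mirror */
		/* ... driver private state ... */
	};

	static int my_mirror_start(struct my_mirror *m, struct mm_struct *mm)
	{
		/* my_mirror_ops: assumed static struct hmm_mirror_ops instance */
		m->mirror.ops = &my_mirror_ops;
		return hmm_mirror_register(&m->mirror, mm);
	}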
When the device driver wants to populate a range of virtual addresses, it can When the device driver wants to populate a range of virtual addresses, it can
use either: use either::
int hmm_vma_get_pfns(struct vm_area_struct *vma, int hmm_vma_get_pfns(struct vm_area_struct *vma,
struct hmm_range *range, struct hmm_range *range,
unsigned long start, unsigned long start,
...@@ -221,7 +214,7 @@ provides a set of flags to help the driver identify special CPU page table ...@@ -221,7 +214,7 @@ provides a set of flags to help the driver identify special CPU page table
entries. entries.
Locking with the update() callback is the most important aspect the driver must Locking with the update() callback is the most important aspect the driver must
respect in order to keep things properly synchronized. The usage pattern is: respect in order to keep things properly synchronized. The usage pattern is::
int driver_populate_range(...) int driver_populate_range(...)
{ {
...@@ -262,9 +255,8 @@ report commands as executed is serialized (there is no point in doing this ...@@ -262,9 +255,8 @@ report commands as executed is serialized (there is no point in doing this
concurrently). concurrently).
------------------------------------------------------------------------------- Represent and manage device memory from core kernel point of view
=================================================================
5) Represent and manage device memory from core kernel point of view
Several different designs were tried to support device memory. First one used Several different designs were tried to support device memory. First one used
a device specific data structure to keep information about migrated memory and a device specific data structure to keep information about migrated memory and
...@@ -280,14 +272,14 @@ unaware of the difference. We only need to make sure that no one ever tries to ...@@ -280,14 +272,14 @@ unaware of the difference. We only need to make sure that no one ever tries to
map those pages from the CPU side. map those pages from the CPU side.
HMM provides a set of helpers to register and hotplug device memory as a new HMM provides a set of helpers to register and hotplug device memory as a new
region needing a struct page. This is offered through a very simple API: region needing a struct page. This is offered through a very simple API::
struct hmm_devmem *hmm_devmem_add(const struct hmm_devmem_ops *ops, struct hmm_devmem *hmm_devmem_add(const struct hmm_devmem_ops *ops,
struct device *device, struct device *device,
unsigned long size); unsigned long size);
void hmm_devmem_remove(struct hmm_devmem *devmem); void hmm_devmem_remove(struct hmm_devmem *devmem);
The hmm_devmem_ops is where most of the important things are: The hmm_devmem_ops is where most of the important things are::
struct hmm_devmem_ops { struct hmm_devmem_ops {
void (*free)(struct hmm_devmem *devmem, struct page *page); void (*free)(struct hmm_devmem *devmem, struct page *page);
...@@ -306,13 +298,12 @@ which it cannot do. This second callback must trigger a migration back to ...@@ -306,13 +298,12 @@ which it cannot do. This second callback must trigger a migration back to
system memory. system memory.
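As an illustration only, a driver probe path could hotplug its memory roughly
as follows (``my_devmem_ops`` is an assumed hmm_devmem_ops instance providing
the free() and fault() callbacks above, ``dev`` and ``size`` are supplied by
the driver, and the usual ERR_PTR error convention is assumed)::

	struct hmm_devmem *devmem;

	devmem = hmm_devmem_add(&my_devmem_ops, dev, size);
	if (IS_ERR(devmem))
		return PTR_ERR(devmem);
	/* device pages are now backed by struct page; keep devmem around
	 * so the memory can be removed with hmm_devmem_remove() later */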
------------------------------------------------------------------------------- Migration to and from device memory
===================================
6) Migration to and from device memory
Because the CPU cannot access device memory, migration must use the device DMA Because the CPU cannot access device memory, migration must use the device DMA
engine to perform copy from and to device memory. For this we need a new engine to perform copy from and to device memory. For this we need a new
migration helper: migration helper::
int migrate_vma(const struct migrate_vma_ops *ops, int migrate_vma(const struct migrate_vma_ops *ops,
struct vm_area_struct *vma, struct vm_area_struct *vma,
...@@ -331,7 +322,7 @@ migration might be for a range of addresses the device is actively accessing. ...@@ -331,7 +322,7 @@ migration might be for a range of addresses the device is actively accessing.
The migrate_vma_ops struct defines two callbacks. First one (alloc_and_copy()) The migrate_vma_ops struct defines two callbacks. First one (alloc_and_copy())
controls destination memory allocation and copy operation. Second one is there controls destination memory allocation and copy operation. Second one is there
to allow the device driver to perform cleanup operations after migration. to allow the device driver to perform cleanup operations after migration::
struct migrate_vma_ops { struct migrate_vma_ops {
void (*alloc_and_copy)(struct vm_area_struct *vma, void (*alloc_and_copy)(struct vm_area_struct *vma,
...@@ -365,9 +356,8 @@ bandwidth but this is considered as a rare event and a price that we are ...@@ -365,9 +356,8 @@ bandwidth but this is considered as a rare event and a price that we are
willing to pay to keep all the code simpler. willing to pay to keep all the code simpler.
------------------------------------------------------------------------------- Memory cgroup (memcg) and rss accounting
========================================
7) Memory cgroup (memcg) and rss accounting
For now device memory is accounted as any regular page in rss counters (either For now device memory is accounted as any regular page in rss counters (either
anonymous if device page is used for anonymous, file if device page is used for anonymous if device page is used for anonymous, file if device page is used for
......
Hugetlbfs Reservation Overview .. _hugetlbfs_reserve:
------------------------------
Huge pages as described at 'Documentation/vm/hugetlbpage.txt' are typically =====================
Hugetlbfs Reservation
=====================
Overview
========
Huge pages as described at :ref:`hugetlbpage` are typically
preallocated for application use. These huge pages are instantiated in a preallocated for application use. These huge pages are instantiated in a
task's address space at page fault time if the VMA indicates huge pages are task's address space at page fault time if the VMA indicates huge pages are
to be used. If no huge page exists at page fault time, the task is sent to be used. If no huge page exists at page fault time, the task is sent
...@@ -17,20 +24,22 @@ describe how huge page reserve processing is done in the v4.10 kernel. ...@@ -17,20 +24,22 @@ describe how huge page reserve processing is done in the v4.10 kernel.
Audience Audience
-------- ========
This description is primarily targeted at kernel developers who are modifying This description is primarily targeted at kernel developers who are modifying
hugetlbfs code. hugetlbfs code.
The Data Structures The Data Structures
------------------- ===================
resv_huge_pages resv_huge_pages
This is a global (per-hstate) count of reserved huge pages. Reserved This is a global (per-hstate) count of reserved huge pages. Reserved
huge pages are only available to the task which reserved them. huge pages are only available to the task which reserved them.
Therefore, the number of huge pages generally available is computed Therefore, the number of huge pages generally available is computed
as (free_huge_pages - resv_huge_pages). as (``free_huge_pages - resv_huge_pages``).
Reserve Map Reserve Map
A reserve map is described by the structure: A reserve map is described by the structure::
struct resv_map { struct resv_map {
struct kref refs; struct kref refs;
spinlock_t lock; spinlock_t lock;
...@@ -39,25 +48,31 @@ Reserve Map ...@@ -39,25 +48,31 @@ Reserve Map
struct list_head region_cache; struct list_head region_cache;
long region_cache_count; long region_cache_count;
}; };
There is one reserve map for each huge page mapping in the system. There is one reserve map for each huge page mapping in the system.
The regions list within the resv_map describes the regions within The regions list within the resv_map describes the regions within
the mapping. A region is described as: the mapping. A region is described as::
struct file_region { struct file_region {
struct list_head link; struct list_head link;
long from; long from;
long to; long to;
}; };
The 'from' and 'to' fields of the file region structure are huge page The 'from' and 'to' fields of the file region structure are huge page
indices into the mapping. Depending on the type of mapping, a indices into the mapping. Depending on the type of mapping, a
region in the resv_map may indicate reservations exist for the region in the resv_map may indicate reservations exist for the
range, or reservations do not exist. range, or reservations do not exist.
Flags for MAP_PRIVATE Reservations Flags for MAP_PRIVATE Reservations
These are stored in the bottom bits of the reservation map pointer. These are stored in the bottom bits of the reservation map pointer.
#define HPAGE_RESV_OWNER (1UL << 0) Indicates this task is the
owner of the reservations associated with the mapping. ``#define HPAGE_RESV_OWNER (1UL << 0)``
#define HPAGE_RESV_UNMAPPED (1UL << 1) Indicates task originally Indicates this task is the owner of the reservations
mapping this range (and creating reserves) has unmapped a associated with the mapping.
page from this task (the child) due to a failed COW. ``#define HPAGE_RESV_UNMAPPED (1UL << 1)``
Indicates task originally mapping this range (and creating
reserves) has unmapped a page from this task (the child)
due to a failed COW.
Page Flags Page Flags
The PagePrivate page flag is used to indicate that a huge page The PagePrivate page flag is used to indicate that a huge page
reservation must be restored when the huge page is freed. More reservation must be restored when the huge page is freed. More
...@@ -65,12 +80,14 @@ Page Flags ...@@ -65,12 +80,14 @@ Page Flags
Reservation Map Location (Private or Shared) Reservation Map Location (Private or Shared)
-------------------------------------------- ============================================
A huge page mapping or segment is either private or shared. If private, A huge page mapping or segment is either private or shared. If private,
it is typically only available to a single address space (task). If shared, it is typically only available to a single address space (task). If shared,
it can be mapped into multiple address spaces (tasks). The location and it can be mapped into multiple address spaces (tasks). The location and
semantics of the reservation map is significantly different for two types semantics of the reservation map is significantly different for two types
of mappings. Location differences are: of mappings. Location differences are:
- For private mappings, the reservation map hangs off the VMA structure. - For private mappings, the reservation map hangs off the VMA structure.
Specifically, vma->vm_private_data. This reserve map is created at the Specifically, vma->vm_private_data. This reserve map is created at the
time the mapping (mmap(MAP_PRIVATE)) is created. time the mapping (mmap(MAP_PRIVATE)) is created.
...@@ -82,12 +99,12 @@ of mappings. Location differences are: ...@@ -82,12 +99,12 @@ of mappings. Location differences are:
Creating Reservations Creating Reservations
--------------------- =====================
Reservations are created when a huge page backed shared memory segment is Reservations are created when a huge page backed shared memory segment is
created (shmget(SHM_HUGETLB)) or a mapping is created via mmap(MAP_HUGETLB). created (shmget(SHM_HUGETLB)) or a mapping is created via mmap(MAP_HUGETLB).
These operations result in a call to the routine hugetlb_reserve_pages() These operations result in a call to the routine hugetlb_reserve_pages()::
int hugetlb_reserve_pages(struct inode *inode, int hugetlb_reserve_pages(struct inode *inode,
long from, long to, long from, long to,
struct vm_area_struct *vma, struct vm_area_struct *vma,
vm_flags_t vm_flags) vm_flags_t vm_flags)
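For reference, one user space operation that leads to this call is an
anonymous mmap() with MAP_HUGETLB. A small sketch (assumes huge pages have
been preallocated and a 2MB default huge page size; the fallback define is
the common architecture value)::

	#include <stdio.h>
	#include <sys/mman.h>

	#ifndef MAP_HUGETLB
	#define MAP_HUGETLB 0x40000	/* common value; architecture specific */
	#endif

	int main(void)
	{
		size_t len = 2UL * 1024 * 1024;	/* multiple of the huge page size */
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

		if (p == MAP_FAILED) {
			perror("mmap");		/* e.g. no huge pages available */
			return 1;
		}
		/* hugetlb_reserve_pages() has set up reservations for the range */
		munmap(p, len);
		return 0;
	}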
...@@ -105,6 +122,7 @@ the 'from' and 'to' arguments have been adjusted by this offset. ...@@ -105,6 +122,7 @@ the 'from' and 'to' arguments have been adjusted by this offset.
One of the big differences between PRIVATE and SHARED mappings is the way One of the big differences between PRIVATE and SHARED mappings is the way
in which reservations are represented in the reservation map. in which reservations are represented in the reservation map.
- For shared mappings, an entry in the reservation map indicates a reservation - For shared mappings, an entry in the reservation map indicates a reservation
exists or did exist for the corresponding page. As reservations are exists or did exist for the corresponding page. As reservations are
consumed, the reservation map is not modified. consumed, the reservation map is not modified.
...@@ -121,12 +139,13 @@ to indicate this VMA owns the reservations. ...@@ -121,12 +139,13 @@ to indicate this VMA owns the reservations.
The reservation map is consulted to determine how many huge page reservations The reservation map is consulted to determine how many huge page reservations
are needed for the current mapping/segment. For private mappings, this is are needed for the current mapping/segment. For private mappings, this is
always the value (to - from). However, for shared mappings it is possible that some reservations may already exist within the range (to - from). See the always the value (to - from). However, for shared mappings it is possible that some reservations may already exist within the range (to - from). See the
section "Reservation Map Modifications" for details on how this is accomplished. section :ref:`Reservation Map Modifications <resv_map_modifications>`
for details on how this is accomplished.
The mapping may be associated with a subpool. If so, the subpool is consulted The mapping may be associated with a subpool. If so, the subpool is consulted
to ensure there is sufficient space for the mapping. It is possible that the to ensure there is sufficient space for the mapping. It is possible that the
subpool has set aside reservations that can be used for the mapping. See the subpool has set aside reservations that can be used for the mapping. See the
section "Subpool Reservations" for more details. section :ref:`Subpool Reservations <sub_pool_resv>` for more details.
After consulting the reservation map and subpool, the number of needed new After consulting the reservation map and subpool, the number of needed new
reservations is known. The routine hugetlb_acct_memory() is called to check reservations is known. The routine hugetlb_acct_memory() is called to check
...@@ -135,9 +154,11 @@ calls into routines that potentially allocate and adjust surplus page counts. ...@@ -135,9 +154,11 @@ calls into routines that potentially allocate and adjust surplus page counts.
However, within those routines the code is simply checking to ensure there However, within those routines the code is simply checking to ensure there
are enough free huge pages to accommodate the reservation. If there are, are enough free huge pages to accommodate the reservation. If there are,
the global reservation count resv_huge_pages is adjusted something like the the global reservation count resv_huge_pages is adjusted something like the
following. following::
if (resv_needed <= (resv_huge_pages - free_huge_pages)) if (resv_needed <= (resv_huge_pages - free_huge_pages))
resv_huge_pages += resv_needed; resv_huge_pages += resv_needed;
Note that the global lock hugetlb_lock is held when checking and adjusting Note that the global lock hugetlb_lock is held when checking and adjusting
these counters. these counters.
...@@ -152,14 +173,18 @@ If hugetlb_reserve_pages() was successful, the global reservation count and ...@@ -152,14 +173,18 @@ If hugetlb_reserve_pages() was successful, the global reservation count and
reservation map associated with the mapping will be modified as required to reservation map associated with the mapping will be modified as required to
ensure reservations exist for the range 'from' - 'to'. ensure reservations exist for the range 'from' - 'to'.
.. _consume_resv:
Consuming Reservations/Allocating a Huge Page Consuming Reservations/Allocating a Huge Page
--------------------------------------------- =============================================
Reservations are consumed when huge pages associated with the reservations Reservations are consumed when huge pages associated with the reservations
are allocated and instantiated in the corresponding mapping. The allocation are allocated and instantiated in the corresponding mapping. The allocation
is performed within the routine alloc_huge_page(). is performed within the routine alloc_huge_page()::
struct page *alloc_huge_page(struct vm_area_struct *vma,
struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve) unsigned long addr, int avoid_reserve)
alloc_huge_page is passed a VMA pointer and a virtual address, so it can alloc_huge_page is passed a VMA pointer and a virtual address, so it can
consult the reservation map to determine if a reservation exists. In addition, consult the reservation map to determine if a reservation exists. In addition,
alloc_huge_page takes the argument avoid_reserve which indicates reserves alloc_huge_page takes the argument avoid_reserve which indicates reserves
...@@ -170,8 +195,9 @@ page are being allocated. ...@@ -170,8 +195,9 @@ page are being allocated.
The helper routine vma_needs_reservation() is called to determine if a The helper routine vma_needs_reservation() is called to determine if a
reservation exists for the address within the mapping(vma). See the section reservation exists for the address within the mapping(vma). See the section
"Reservation Map Helper Routines" for detailed information on what this :ref:`Reservation Map Helper Routines <resv_map_helpers>` for detailed
routine does. The value returned from vma_needs_reservation() is generally information on what this routine does.
The value returned from vma_needs_reservation() is generally
0 or 1. 0 if a reservation exists for the address, 1 if no reservation exists. 0 or 1. 0 if a reservation exists for the address, 1 if no reservation exists.
If a reservation does not exist, and there is a subpool associated with the If a reservation does not exist, and there is a subpool associated with the
mapping the subpool is consulted to determine if it contains reservations. mapping the subpool is consulted to determine if it contains reservations.
...@@ -180,21 +206,25 @@ However, in every case the avoid_reserve argument overrides the use of ...@@ -180,21 +206,25 @@ However, in every case the avoid_reserve argument overrides the use of
a reservation for the allocation. After determining whether a reservation a reservation for the allocation. After determining whether a reservation
exists and can be used for the allocation, the routine dequeue_huge_page_vma() exists and can be used for the allocation, the routine dequeue_huge_page_vma()
is called. This routine takes two arguments related to reservations: is called. This routine takes two arguments related to reservations:
- avoid_reserve, this is the same value/argument passed to alloc_huge_page() - avoid_reserve, this is the same value/argument passed to alloc_huge_page()
- chg, even though this argument is of type long only the values 0 or 1 are - chg, even though this argument is of type long only the values 0 or 1 are
passed to dequeue_huge_page_vma. If the value is 0, it indicates a passed to dequeue_huge_page_vma. If the value is 0, it indicates a
reservation exists (see the section "Memory Policy and Reservations" for reservation exists (see the section "Memory Policy and Reservations" for
possible issues). If the value is 1, it indicates a reservation does not possible issues). If the value is 1, it indicates a reservation does not
exist and the page must be taken from the global free pool if possible. exist and the page must be taken from the global free pool if possible.
The free lists associated with the memory policy of the VMA are searched for The free lists associated with the memory policy of the VMA are searched for
a free page. If a page is found, the value free_huge_pages is decremented a free page. If a page is found, the value free_huge_pages is decremented
when the page is removed from the free list. If there was a reservation when the page is removed from the free list. If there was a reservation
associated with the page, the following adjustments are made: associated with the page, the following adjustments are made::
SetPagePrivate(page); /* Indicates allocating this page consumed SetPagePrivate(page); /* Indicates allocating this page consumed
* a reservation, and if an error is * a reservation, and if an error is
* encountered such that the page must be * encountered such that the page must be
* freed, the reservation will be restored. */ * freed, the reservation will be restored. */
resv_huge_pages--; /* Decrement the global reservation count */ resv_huge_pages--; /* Decrement the global reservation count */
Note, if no huge page can be found that satisfies the VMA's memory policy Note, if no huge page can be found that satisfies the VMA's memory policy
an attempt will be made to allocate one using the buddy allocator. This an attempt will be made to allocate one using the buddy allocator. This
brings up the issue of surplus huge pages and overcommit which is beyond brings up the issue of surplus huge pages and overcommit which is beyond
...@@ -222,12 +252,14 @@ mapping. In such cases, the reservation count and subpool free page count ...@@ -222,12 +252,14 @@ mapping. In such cases, the reservation count and subpool free page count
will be off by one. This rare condition can be identified by comparing the will be off by one. This rare condition can be identified by comparing the
return value from vma_needs_reservation and vma_commit_reservation. If such return value from vma_needs_reservation and vma_commit_reservation. If such
a race is detected, the subpool and global reserve counts are adjusted to a race is detected, the subpool and global reserve counts are adjusted to
compensate. See the section "Reservation Map Helper Routines" for more compensate. See the section
:ref:`Reservation Map Helper Routines <resv_map_helpers>` for more
information on these routines. information on these routines.
Instantiate Huge Pages Instantiate Huge Pages
---------------------- ======================
After huge page allocation, the page is typically added to the page tables After huge page allocation, the page is typically added to the page tables
of the allocating task. Before this, pages in a shared mapping are added of the allocating task. Before this, pages in a shared mapping are added
to the page cache and pages in private mappings are added to an anonymous to the page cache and pages in private mappings are added to an anonymous
...@@ -237,7 +269,8 @@ to the global reservation count (resv_huge_pages). ...@@ -237,7 +269,8 @@ to the global reservation count (resv_huge_pages).
Freeing Huge Pages Freeing Huge Pages
------------------ ==================
Huge page freeing is performed by the routine free_huge_page(). This routine Huge page freeing is performed by the routine free_huge_page(). This routine
is the destructor for hugetlbfs compound pages. As a result, it is only is the destructor for hugetlbfs compound pages. As a result, it is only
passed a pointer to the page struct. When a huge page is freed, reservation passed a pointer to the page struct. When a huge page is freed, reservation
...@@ -247,7 +280,8 @@ on an error path where a global reserve count must be restored. ...@@ -247,7 +280,8 @@ on an error path where a global reserve count must be restored.
The page->private field points to any subpool associated with the page. The page->private field points to any subpool associated with the page.
If the PagePrivate flag is set, it indicates the global reserve count should If the PagePrivate flag is set, it indicates the global reserve count should
be adjusted (see the section "Consuming Reservations/Allocating a Huge Page" be adjusted (see the section
:ref:`Consuming Reservations/Allocating a Huge Page <consume_resv>`
for information on how these are set). for information on how these are set).
The routine first calls hugepage_subpool_put_pages() for the page. If this The routine first calls hugepage_subpool_put_pages() for the page. If this
...@@ -259,9 +293,11 @@ Therefore, the global resv_huge_pages counter is incremented in this case. ...@@ -259,9 +293,11 @@ Therefore, the global resv_huge_pages counter is incremented in this case.
If the PagePrivate flag was set in the page, the global resv_huge_pages counter If the PagePrivate flag was set in the page, the global resv_huge_pages counter
will always be incremented. will always be incremented.
.. _sub_pool_resv:
Subpool Reservations Subpool Reservations
-------------------- ====================
There is a struct hstate associated with each huge page size. The hstate There is a struct hstate associated with each huge page size. The hstate
tracks all huge pages of the specified size. A subpool represents a subset tracks all huge pages of the specified size. A subpool represents a subset
of pages within a hstate that is associated with a mounted hugetlbfs of pages within a hstate that is associated with a mounted hugetlbfs
...@@ -295,7 +331,8 @@ the global pools. ...@@ -295,7 +331,8 @@ the global pools.
COW and Reservations COW and Reservations
-------------------- ====================
Since shared mappings all point to and use the same underlying pages, the Since shared mappings all point to and use the same underlying pages, the
biggest reservation concern for COW is private mappings. In this case, biggest reservation concern for COW is private mappings. In this case,
two tasks can be pointing at the same previously allocated page. One task two tasks can be pointing at the same previously allocated page. One task
...@@ -326,30 +363,36 @@ faults on a non-present page. But, the original owner of the ...@@ -326,30 +363,36 @@ faults on a non-present page. But, the original owner of the
mapping/reservation will behave as expected. mapping/reservation will behave as expected.
.. _resv_map_modifications:
Reservation Map Modifications Reservation Map Modifications
----------------------------- =============================
The following low level routines are used to make modifications to a The following low level routines are used to make modifications to a
reservation map. Typically, these routines are not called directly. Rather, reservation map. Typically, these routines are not called directly. Rather,
a reservation map helper routine is called which calls one of these low level a reservation map helper routine is called which calls one of these low level
routines. These low level routines are fairly well documented in the source routines. These low level routines are fairly well documented in the source
code (mm/hugetlb.c). These routines are: code (mm/hugetlb.c). These routines are::
long region_chg(struct resv_map *resv, long f, long t);
long region_add(struct resv_map *resv, long f, long t); long region_chg(struct resv_map *resv, long f, long t);
void region_abort(struct resv_map *resv, long f, long t); long region_add(struct resv_map *resv, long f, long t);
long region_count(struct resv_map *resv, long f, long t); void region_abort(struct resv_map *resv, long f, long t);
long region_count(struct resv_map *resv, long f, long t);
Operations on the reservation map typically involve two operations: Operations on the reservation map typically involve two operations:
1) region_chg() is called to examine the reserve map and determine how 1) region_chg() is called to examine the reserve map and determine how
many pages in the specified range [f, t) are NOT currently represented. many pages in the specified range [f, t) are NOT currently represented.
The calling code performs global checks and allocations to determine if The calling code performs global checks and allocations to determine if
there are enough huge pages for the operation to succeed. there are enough huge pages for the operation to succeed.
2a) If the operation can succeed, region_add() is called to actually modify 2)
a) If the operation can succeed, region_add() is called to actually modify
the reservation map for the same range [f, t) previously passed to the reservation map for the same range [f, t) previously passed to
region_chg(). region_chg().
2b) If the operation can not succeed, region_abort is called for the same range b) If the operation can not succeed, region_abort is called for the same
[f, t) to abort the operation. range [f, t) to abort the operation.
Note that this is a two step process where region_add() and region_abort() Note that this is a two step process where region_add() and region_abort()
are guaranteed to succeed after a prior call to region_chg() for the same are guaranteed to succeed after a prior call to region_chg() for the same
...@@ -371,6 +414,7 @@ and make the appropriate adjustments. ...@@ -371,6 +414,7 @@ and make the appropriate adjustments.
The routine region_del() is called to remove regions from a reservation map. The routine region_del() is called to remove regions from a reservation map.
It is typically called in the following situations: It is typically called in the following situations:
- When a file in the hugetlbfs filesystem is being removed, the inode will - When a file in the hugetlbfs filesystem is being removed, the inode will
be released and the reservation map freed. Before freeing the reservation be released and the reservation map freed. Before freeing the reservation
map, all the individual file_region structures must be freed. In this case map, all the individual file_region structures must be freed. In this case
...@@ -384,6 +428,7 @@ It is typically called in the following situations: ...@@ -384,6 +428,7 @@ It is typically called in the following situations:
removed, region_del() is called to remove the corresponding entry from the removed, region_del() is called to remove the corresponding entry from the
reservation map. In this case, region_del is passed the range reservation map. In this case, region_del is passed the range
[page_idx, page_idx + 1). [page_idx, page_idx + 1).
In every case, region_del() will return the number of pages removed from the In every case, region_del() will return the number of pages removed from the
reservation map. In VERY rare cases, region_del() can fail. This can only reservation map. In VERY rare cases, region_del() can fail. This can only
happen in the hole punch case where it has to split an existing file_region happen in the hole punch case where it has to split an existing file_region
...@@ -403,9 +448,11 @@ outstanding (outstanding = (end - start) - region_count(resv, start, end)). ...@@ -403,9 +448,11 @@ outstanding (outstanding = (end - start) - region_count(resv, start, end)).
Since the mapping is going away, the subpool and global reservation counts Since the mapping is going away, the subpool and global reservation counts
are decremented by the number of outstanding reservations. are decremented by the number of outstanding reservations.
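To recap the two step pattern described above, the calling code looks roughly
like the following sketch (``global_checks_pass()`` is a hypothetical stand-in
for the huge page accounting performed by the caller)::

	long needed, added;

	needed = region_chg(resv, from, to);	/* pages in [from, to) not in the map */
	if (needed < 0)
		return needed;

	if (!global_checks_pass(needed)) {
		region_abort(resv, from, to);	/* give up the in-progress operation */
		return -ENOSPC;
	}

	added = region_add(resv, from, to);	/* will not fail after region_chg() */
	/* 'added' may differ from 'needed' if the map changed in between */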
.. _resv_map_helpers:
Reservation Map Helper Routines Reservation Map Helper Routines
------------------------------- ===============================
Several helper routines exist to query and modify the reservation maps. Several helper routines exist to query and modify the reservation maps.
These routines are only interested in reservations for a specific huge These routines are only interested in reservations for a specific huge
page, so they just pass in an address instead of a range. In addition, page, so they just pass in an address instead of a range. In addition,
...@@ -414,32 +461,40 @@ or shared) and the location of the reservation map (inode or VMA) can be ...@@ -414,32 +461,40 @@ or shared) and the location of the reservation map (inode or VMA) can be
determined. These routines simply call the underlying routines described determined. These routines simply call the underlying routines described
in the section "Reservation Map Modifications". However, they do take into in the section "Reservation Map Modifications". However, they do take into
account the 'opposite' meaning of reservation map entries for private and account the 'opposite' meaning of reservation map entries for private and
shared mappings and hide this detail from the caller. shared mappings and hide this detail from the caller::
long vma_needs_reservation(struct hstate *h,
struct vm_area_struct *vma,
unsigned long addr)
long vma_needs_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
This routine calls region_chg() for the specified page. If no reservation This routine calls region_chg() for the specified page. If no reservation
exists, 1 is returned. If a reservation exists, 0 is returned. exists, 1 is returned. If a reservation exists, 0 is returned::
long vma_commit_reservation(struct hstate *h,
struct vm_area_struct *vma,
unsigned long addr)
long vma_commit_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
This calls region_add() for the specified page. As in the case of region_chg This calls region_add() for the specified page. As in the case of region_chg
and region_add, this routine is to be called after a previous call to and region_add, this routine is to be called after a previous call to
vma_needs_reservation. It will add a reservation entry for the page. It vma_needs_reservation. It will add a reservation entry for the page. It
returns 1 if the reservation was added and 0 if not. The return value should returns 1 if the reservation was added and 0 if not. The return value should
be compared with the return value of the previous call to be compared with the return value of the previous call to
vma_needs_reservation. An unexpected difference indicates the reservation vma_needs_reservation. An unexpected difference indicates the reservation
map was modified between calls. map was modified between calls::
void vma_end_reservation(struct hstate *h,
struct vm_area_struct *vma,
unsigned long addr)
void vma_end_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
This calls region_abort() for the specified page. As in the case of region_chg This calls region_abort() for the specified page. As in the case of region_chg
and region_abort, this routine is to be called after a previous call to and region_abort, this routine is to be called after a previous call to
vma_needs_reservation. It will abort/end the in progress reservation add vma_needs_reservation. It will abort/end the in progress reservation add
operation. operation::
long vma_add_reservation(struct hstate *h,
struct vm_area_struct *vma,
unsigned long addr)
long vma_add_reservation(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr)
This is a special wrapper routine to help facilitate reservation cleanup This is a special wrapper routine to help facilitate reservation cleanup
on error paths. It is only called from the routine restore_reserve_on_error(). on error paths. It is only called from the routine restore_reserve_on_error().
This routine is used in conjunction with vma_needs_reservation in an attempt This routine is used in conjunction with vma_needs_reservation in an attempt
...@@ -453,8 +508,10 @@ be done on error paths. ...@@ -453,8 +508,10 @@ be done on error paths.
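Putting the helpers above together, the intended calling pattern is roughly
the following sketch (the allocation itself is elided and
``page_allocation_succeeded`` is a hypothetical placeholder)::

	long needs, committed;

	needs = vma_needs_reservation(h, vma, addr);	/* 0: reserved, 1: not */

	/* ... attempt to allocate the huge page ... */

	if (page_allocation_succeeded) {
		committed = vma_commit_reservation(h, vma, addr);
		if (committed != needs) {
			/* reserve map changed between the calls; adjust the
			 * subpool and global counts as described above */
		}
	} else {
		vma_end_reservation(h, vma, addr);
	}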
Reservation Cleanup in Error Paths Reservation Cleanup in Error Paths
---------------------------------- ==================================
As mentioned in the section "Reservation Map Helper Routines", reservation
As mentioned in the section
:ref:`Reservation Map Helper Routines <resv_map_helpers>`, reservation
map modifications are performed in two steps. First vma_needs_reservation map modifications are performed in two steps. First vma_needs_reservation
is called before a page is allocated. If the allocation is successful, is called before a page is allocated. If the allocation is successful,
then vma_commit_reservation is called. If not, vma_end_reservation is called. then vma_commit_reservation is called. If not, vma_end_reservation is called.
...@@ -494,13 +551,14 @@ so that a reservation will not be leaked when the huge page is freed. ...@@ -494,13 +551,14 @@ so that a reservation will not be leaked when the huge page is freed.
Reservations and Memory Policy Reservations and Memory Policy
------------------------------ ==============================
Per-node huge page lists existed in struct hstate when git was first used Per-node huge page lists existed in struct hstate when git was first used
to manage Linux code. The concept of reservations was added some time later. to manage Linux code. The concept of reservations was added some time later.
When reservations were added, no attempt was made to take memory policy When reservations were added, no attempt was made to take memory policy
into account. While cpusets are not exactly the same as memory policy, this into account. While cpusets are not exactly the same as memory policy, this
comment in hugetlb_acct_memory sums up the interaction between reservations comment in hugetlb_acct_memory sums up the interaction between reservations
and cpusets/memory policy. and cpusets/memory policy::
/* /*
* When cpuset is configured, it breaks the strict hugetlb page * When cpuset is configured, it breaks the strict hugetlb page
* reservation as the accounting is done on a global variable. Such * reservation as the accounting is done on a global variable. Such
......
.. _hwpoison:
========
hwpoison
========
What is hwpoison? What is hwpoison?
=================
Upcoming Intel CPUs have support for recovering from some memory errors Upcoming Intel CPUs have support for recovering from some memory errors
(``MCA recovery''). This requires the OS to declare a page "poisoned", (``MCA recovery``). This requires the OS to declare a page "poisoned",
kill the processes associated with it and avoid using it in the future. kill the processes associated with it and avoid using it in the future.
This patchkit implements the necessary infrastructure in the VM. This patchkit implements the necessary infrastructure in the VM.
...@@ -46,9 +53,10 @@ address. This in theory allows other applications to handle ...@@ -46,9 +53,10 @@ address. This in theory allows other applications to handle
memory failures too. The expectation is that nearly all applications memory failures too. The expectation is that nearly all applications
won't do that, but some very specialized ones might. won't do that, but some very specialized ones might.
--- Failure recovery modes
======================
There are two (actually three) modi memory failure recovery can be in: There are two (actually three) modes memory failure recovery can be in:
vm.memory_failure_recovery sysctl set to zero: vm.memory_failure_recovery sysctl set to zero:
All memory failures cause a panic. Do not attempt recovery. All memory failures cause a panic. Do not attempt recovery.
...@@ -67,9 +75,8 @@ late kill ...@@ -67,9 +75,8 @@ late kill
This is best for memory error unaware applications and is the default. This is best for memory error unaware applications and is the default.
Note some pages are always handled as late kill. Note some pages are always handled as late kill.
--- User control
============
User control:
vm.memory_failure_recovery vm.memory_failure_recovery
See sysctl.txt See sysctl.txt
...@@ -79,11 +86,19 @@ vm.memory_failure_early_kill ...@@ -79,11 +86,19 @@ vm.memory_failure_early_kill
PR_MCE_KILL PR_MCE_KILL
Set early/late kill mode/revert to system default Set early/late kill mode/revert to system default
arg1: PR_MCE_KILL_CLEAR: Revert to system default
arg1: PR_MCE_KILL_SET: arg2 defines thread specific mode arg1: PR_MCE_KILL_CLEAR:
PR_MCE_KILL_EARLY: Early kill Revert to system default
PR_MCE_KILL_LATE: Late kill arg1: PR_MCE_KILL_SET:
PR_MCE_KILL_DEFAULT: Use system global default arg2 defines thread specific mode
PR_MCE_KILL_EARLY:
Early kill
PR_MCE_KILL_LATE:
Late kill
PR_MCE_KILL_DEFAULT
Use system global default
Note that if you want to have a dedicated thread which handles Note that if you want to have a dedicated thread which handles
the SIGBUS(BUS_MCEERR_AO) on behalf of the process, you should the SIGBUS(BUS_MCEERR_AO) on behalf of the process, you should
call prctl(PR_MCE_KILL_EARLY) on the designated thread. Otherwise, call prctl(PR_MCE_KILL_EARLY) on the designated thread. Otherwise,
...@@ -92,48 +107,38 @@ PR_MCE_KILL ...@@ -92,48 +107,38 @@ PR_MCE_KILL
PR_MCE_KILL_GET PR_MCE_KILL_GET
return current mode return current mode
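For example, a thread can switch itself to early kill mode and query the
current setting with a short snippet like this (illustrative only; on glibc
the PR_MCE_KILL* constants come in via <linux/prctl.h>)::

	#include <stdio.h>
	#include <sys/prctl.h>

	int main(void)
	{
		if (prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0))
			perror("PR_MCE_KILL");

		printf("current mode: %d\n", prctl(PR_MCE_KILL_GET, 0, 0, 0, 0));
		return 0;
	}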
Testing
=======
--- * madvise(MADV_HWPOISON, ....) (as root) - Poison a page in the
process for testing
Testing:
madvise(MADV_HWPOISON, ....)
(as root)
Poison a page in the process for testing
hwpoison-inject module through debugfs * hwpoison-inject module through debugfs ``/sys/kernel/debug/hwpoison/``
/sys/kernel/debug/hwpoison/ corrupt-pfn
Inject hwpoison fault at PFN echoed into this file. This does
some early filtering to avoid corrupted unintended pages in test suites.
corrupt-pfn unpoison-pfn
Software-unpoison page at PFN echoed into this file. This way
a page can be reused again. This only works for Linux
injected failures, not for real memory failures.
Inject hwpoison fault at PFN echoed into this file. This does Note these injection interfaces are not stable and might change between
some early filtering to avoid corrupted unintended pages in test suites. kernel versions
unpoison-pfn corrupt-filter-dev-major, corrupt-filter-dev-minor
Only handle memory failures to pages associated with the file
system defined by block device major/minor. -1U is the
wildcard value. This should be only used for testing with
artificial injection.
Software-unpoison page at PFN echoed into this file. This corrupt-filter-memcg
way a page can be reused again. Limit injection to pages owned by memgroup. Specified by inode
This only works for Linux injected failures, not for real number of the memcg.
memory failures.
Note these injection interfaces are not stable and might change between Example::
kernel versions
corrupt-filter-dev-major
corrupt-filter-dev-minor
Only handle memory failures to pages associated with the file system defined
by block device major/minor. -1U is the wildcard value.
This should be only used for testing with artificial injection.
corrupt-filter-memcg
Limit injection to pages owned by memgroup. Specified by inode number
of the memcg.
Example:
mkdir /sys/fs/cgroup/mem/hwpoison mkdir /sys/fs/cgroup/mem/hwpoison
usemem -m 100 -s 1000 & usemem -m 100 -s 1000 &
...@@ -145,24 +150,21 @@ Example: ...@@ -145,24 +150,21 @@ Example:
page-types -p `pidof init` --hwpoison # shall do nothing page-types -p `pidof init` --hwpoison # shall do nothing
page-types -p `pidof usemem` --hwpoison # poison its pages page-types -p `pidof usemem` --hwpoison # poison its pages
corrupt-filter-flags-mask corrupt-filter-flags-mask, corrupt-filter-flags-value
corrupt-filter-flags-value When specified, only poison pages if ((page_flags & mask) ==
value). This allows stress testing of many kinds of
When specified, only poison pages if ((page_flags & mask) == value). pages. The page_flags are the same as in /proc/kpageflags. The
This allows stress testing of many kinds of pages. The page_flags flag bits are defined in include/linux/kernel-page-flags.h and
are the same as in /proc/kpageflags. The flag bits are defined in documented in Documentation/vm/pagemap.rst
include/linux/kernel-page-flags.h and documented in
Documentation/vm/pagemap.txt
Architecture specific MCE injector * Architecture specific MCE injector
x86 has mce-inject, mce-test x86 has mce-inject, mce-test
Some portable hwpoison test programs in mce-test, see blow. Some portable hwpoison test programs in mce-test, see below.
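To make the madvise() item above concrete, a minimal injection sketch looks
like this (must run as root, requires CONFIG_MEMORY_FAILURE; the fallback
define is the value from the generic uapi headers)::

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	#ifndef MADV_HWPOISON
	#define MADV_HWPOISON 100
	#endif

	int main(void)
	{
		long pagesize = sysconf(_SC_PAGESIZE);
		void *p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		memset(p, 0, pagesize);			/* instantiate the page */
		if (madvise(p, pagesize, MADV_HWPOISON))
			perror("madvise");		/* needs CAP_SYS_ADMIN */
		return 0;
	}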
--- References
==========
References:
http://halobates.de/mce-lc09-2.pdf http://halobates.de/mce-lc09-2.pdf
Overview presentation from LinuxCon 09 Overview presentation from LinuxCon 09
...@@ -174,14 +176,11 @@ git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git ...@@ -174,14 +176,11 @@ git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
x86 specific injector x86 specific injector
--- Limitations
===========
Limitations:
- Not all page types are supported and never will. Most kernel internal - Not all page types are supported and never will. Most kernel internal
objects cannot be recovered, only LRU pages for now. objects cannot be recovered, only LRU pages for now.
- Right now hugepage support is missing. - Right now hugepage support is missing.
--- ---
Andi Kleen, Oct 2009 Andi Kleen, Oct 2009
MOTIVATION .. _idle_page_tracking:
==================
Idle Page Tracking
==================
Motivation
==========
The idle page tracking feature allows tracking which memory pages are being The idle page tracking feature allows tracking which memory pages are being
accessed by a workload and which are idle. This information can be useful for accessed by a workload and which are idle. This information can be useful for
...@@ -8,10 +15,14 @@ or deciding where to place the workload within a compute cluster. ...@@ -8,10 +15,14 @@ or deciding where to place the workload within a compute cluster.
It is enabled by CONFIG_IDLE_PAGE_TRACKING=y. It is enabled by CONFIG_IDLE_PAGE_TRACKING=y.
USER API .. _user_api:
The idle page tracking API is located at /sys/kernel/mm/page_idle. Currently, User API
it consists of the only read-write file, /sys/kernel/mm/page_idle/bitmap. ========
The idle page tracking API is located at ``/sys/kernel/mm/page_idle``.
Currently, it consists of the only read-write file,
``/sys/kernel/mm/page_idle/bitmap``.
The file implements a bitmap where each bit corresponds to a memory page. The The file implements a bitmap where each bit corresponds to a memory page. The
bitmap is represented by an array of 8-byte integers, and the page at PFN #i is bitmap is represented by an array of 8-byte integers, and the page at PFN #i is
...@@ -19,8 +30,9 @@ mapped to bit #i%64 of array element #i/64, byte order is native. When a bit is ...@@ -19,8 +30,9 @@ mapped to bit #i%64 of array element #i/64, byte order is native. When a bit is
set, the corresponding page is idle. set, the corresponding page is idle.
A page is considered idle if it has not been accessed since it was marked idle A page is considered idle if it has not been accessed since it was marked idle
(for more details on what "accessed" actually means see the IMPLEMENTATION (for more details on what "accessed" actually means see the :ref:`Implementation
DETAILS section). To mark a page idle one has to set the bit corresponding to Details <impl_details>` section).
To mark a page idle one has to set the bit corresponding to
the page by writing to the file. A value written to the file is OR-ed with the the page by writing to the file. A value written to the file is OR-ed with the
current bitmap value. current bitmap value.
...@@ -30,9 +42,9 @@ page types (e.g. SLAB pages) an attempt to mark a page idle is silently ignored, ...@@ -30,9 +42,9 @@ page types (e.g. SLAB pages) an attempt to mark a page idle is silently ignored,
and hence such pages are never reported idle. and hence such pages are never reported idle.
For huge pages the idle flag is set only on the head page, so one has to read For huge pages the idle flag is set only on the head page, so one has to read
/proc/kpageflags in order to correctly count idle huge pages. ``/proc/kpageflags`` in order to correctly count idle huge pages.
Reading from or writing to /sys/kernel/mm/page_idle/bitmap will return Reading from or writing to ``/sys/kernel/mm/page_idle/bitmap`` will return
-EINVAL if you are not starting the read/write on an 8-byte boundary, or -EINVAL if you are not starting the read/write on an 8-byte boundary, or
if the size of the read/write is not a multiple of 8 bytes. Writing to if the size of the read/write is not a multiple of 8 bytes. Writing to
this file beyond max PFN will return -ENXIO. this file beyond max PFN will return -ENXIO.
...@@ -41,21 +53,25 @@ That said, in order to estimate the amount of pages that are not used by a ...@@ -41,21 +53,25 @@ That said, in order to estimate the amount of pages that are not used by a
workload one should: workload one should:
1. Mark all the workload's pages as idle by setting corresponding bits in 1. Mark all the workload's pages as idle by setting corresponding bits in
/sys/kernel/mm/page_idle/bitmap. The pages can be found by reading ``/sys/kernel/mm/page_idle/bitmap``. The pages can be found by reading
/proc/pid/pagemap if the workload is represented by a process, or by ``/proc/pid/pagemap`` if the workload is represented by a process, or by
filtering out alien pages using /proc/kpagecgroup in case the workload is filtering out alien pages using ``/proc/kpagecgroup`` in case the workload
placed in a memory cgroup. is placed in a memory cgroup.
2. Wait until the workload accesses its working set. 2. Wait until the workload accesses its working set.
3. Read /sys/kernel/mm/page_idle/bitmap and count the number of bits set. If 3. Read ``/sys/kernel/mm/page_idle/bitmap`` and count the number of bits set.
one wants to ignore certain types of pages, e.g. mlocked pages since they If one wants to ignore certain types of pages, e.g. mlocked pages since they
are not reclaimable, he or she can filter them out using /proc/kpageflags. are not reclaimable, he or she can filter them out using
``/proc/kpageflags``.
See Documentation/vm/pagemap.rst for more information about
``/proc/pid/pagemap``, ``/proc/kpageflags``, and ``/proc/kpagecgroup``.
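A sketch of steps 1 and 3 for a group of 64 pages starting at a 64-aligned
PFN (the PFN is an arbitrary example; error handling is trimmed)::

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		uint64_t pfn = 0x10000;		/* example PFN, multiple of 64 */
		uint64_t word = ~0ULL;		/* one bit per page, all set */
		off_t off = pfn / 64 * 8;	/* byte offset of the covering word */
		int fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR);

		if (fd < 0)
			return 1;
		pwrite(fd, &word, sizeof(word), off);	/* step 1: mark idle */

		/* ... step 2: let the workload run ... */

		pread(fd, &word, sizeof(word), off);	/* step 3: re-read */
		printf("%d of 64 pages still idle\n", __builtin_popcountll(word));
		close(fd);
		return 0;
	}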
See Documentation/vm/pagemap.txt for more information about /proc/pid/pagemap, .. _impl_details:
/proc/kpageflags, and /proc/kpagecgroup.
IMPLEMENTATION DETAILS Implementation Details
======================
The kernel internally keeps track of accesses to user memory pages in order to The kernel internally keeps track of accesses to user memory pages in order to
reclaim unreferenced pages first on memory shortage conditions. A page is reclaim unreferenced pages first on memory shortage conditions. A page is
...@@ -77,7 +93,8 @@ When a dirty page is written to swap or disk as a result of memory reclaim or ...@@ -77,7 +93,8 @@ When a dirty page is written to swap or disk as a result of memory reclaim or
exceeding the dirty memory limit, it is not marked referenced. exceeding the dirty memory limit, it is not marked referenced.
The idle memory tracking feature adds a new page flag, the Idle flag. This flag The idle memory tracking feature adds a new page flag, the Idle flag. This flag
is set manually, by writing to /sys/kernel/mm/page_idle/bitmap (see the USER API is set manually, by writing to ``/sys/kernel/mm/page_idle/bitmap`` (see the
:ref:`User API <user_api>`
section), and cleared automatically whenever a page is referenced as defined section), and cleared automatically whenever a page is referenced as defined
above. above.
......
=====================================
Linux Memory Management Documentation
=====================================
This is a collection of documents about Linux memory management (mm) subsystem.
User guides for MM features
===========================
The following documents provide guides for controlling and tuning
various features of the Linux memory management
.. toctree::
:maxdepth: 1
hugetlbpage
idle_page_tracking
ksm
numa_memory_policy
pagemap
transhuge
soft-dirty
swap_numa
userfaultfd
zswap
Kernel developers MM documentation
==================================
The below documents describe MM internals with different level of
details ranging from notes and mailing list responses to elaborate
descriptions of data structures and algorithms.
.. toctree::
:maxdepth: 1
active_mm
balance
cleancache
frontswap
highmem
hmm
hwpoison
hugetlbfs_reserv
mmu_notifier
numa
overcommit-accounting
page_migration
page_frags
page_owner
remap_file_pages
slub
split_page_table_lock
unevictable-lru
z3fold
zsmalloc
How to use the Kernel Samepage Merging feature .. _ksm:
----------------------------------------------
=======================
Kernel Samepage Merging
=======================
KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y, KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y,
added to the Linux kernel in 2.6.32. See mm/ksm.c for its implementation, added to the Linux kernel in 2.6.32. See ``mm/ksm.c`` for its implementation,
and http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/ and http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/
The KSM daemon ksmd periodically scans those areas of user memory which The KSM daemon ksmd periodically scans those areas of user memory which
...@@ -51,110 +54,112 @@ Applications should be considerate in their use of MADV_MERGEABLE, ...@@ -51,110 +54,112 @@ Applications should be considerate in their use of MADV_MERGEABLE,
restricting its use to areas likely to benefit. KSM's scans may use a lot restricting its use to areas likely to benefit. KSM's scans may use a lot
of processing power: some installations will disable KSM for that reason. of processing power: some installations will disable KSM for that reason.
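An application opts a region in with madvise(); for example (sketch, requires
CONFIG_KSM=y and ksmd running; the fallback define is the value from the
generic uapi headers)::

	#include <stdio.h>
	#include <sys/mman.h>

	#ifndef MADV_MERGEABLE
	#define MADV_MERGEABLE 12
	#endif

	int main(void)
	{
		size_t len = 64UL * 1024 * 1024;	/* area expected to hold duplicate data */
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		if (madvise(p, len, MADV_MERGEABLE))	/* register the area with KSM */
			perror("madvise");
		/* ksmd will merge identical pages within the area on later scans */
		return 0;
	}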
The KSM daemon is controlled by sysfs files in /sys/kernel/mm/ksm/, The KSM daemon is controlled by sysfs files in ``/sys/kernel/mm/ksm/``,
readable by all but writable only by root: readable by all but writable only by root:
pages_to_scan - how many present pages to scan before ksmd goes to sleep pages_to_scan
e.g. "echo 100 > /sys/kernel/mm/ksm/pages_to_scan" how many present pages to scan before ksmd goes to sleep
Default: 100 (chosen for demonstration purposes) e.g. ``echo 100 > /sys/kernel/mm/ksm/pages_to_scan`` Default: 100
(chosen for demonstration purposes)
sleep_millisecs - how many milliseconds ksmd should sleep before next scan
e.g. "echo 20 > /sys/kernel/mm/ksm/sleep_millisecs" sleep_millisecs
Default: 20 (chosen for demonstration purposes) how many milliseconds ksmd should sleep before next scan
e.g. ``echo 20 > /sys/kernel/mm/ksm/sleep_millisecs`` Default: 20
merge_across_nodes - specifies if pages from different numa nodes can be merged. (chosen for demonstration purposes)
When set to 0, ksm merges only pages which physically
reside in the memory area of same NUMA node. That brings merge_across_nodes
lower latency to access of shared pages. Systems with more specifies if pages from different numa nodes can be merged.
nodes, at significant NUMA distances, are likely to benefit When set to 0, ksm merges only pages which physically reside
from the lower latency of setting 0. Smaller systems, which in the memory area of same NUMA node. That brings lower
need to minimize memory usage, are likely to benefit from latency to access of shared pages. Systems with more nodes, at
the greater sharing of setting 1 (default). You may wish to significant NUMA distances, are likely to benefit from the
compare how your system performs under each setting, before lower latency of setting 0. Smaller systems, which need to
deciding on which to use. merge_across_nodes setting can be minimize memory usage, are likely to benefit from the greater
changed only when there are no ksm shared pages in system: sharing of setting 1 (default). You may wish to compare how
set run 2 to unmerge pages first, then to 1 after changing your system performs under each setting, before deciding on
which to use. merge_across_nodes setting can be changed only
when there are no ksm shared pages in system: set run 2 to
unmerge pages first, then to 1 after changing
merge_across_nodes, to remerge according to the new setting. merge_across_nodes, to remerge according to the new setting.
Default: 1 (merging across nodes as in earlier releases) Default: 1 (merging across nodes as in earlier releases)
run - set 0 to stop ksmd from running but keep merged pages, run
set 1 to run ksmd e.g. "echo 1 > /sys/kernel/mm/ksm/run", set 0 to stop ksmd from running but keep merged pages,
set 2 to stop ksmd and unmerge all pages currently merged, set 1 to run ksmd e.g. ``echo 1 > /sys/kernel/mm/ksm/run``,
but leave mergeable areas registered for next run set 2 to stop ksmd and unmerge all pages currently merged, but
Default: 0 (must be changed to 1 to activate KSM, leave mergeable areas registered for next run Default: 0 (must
except if CONFIG_SYSFS is disabled) be changed to 1 to activate KSM, except if CONFIG_SYSFS is
disabled)
use_zero_pages - specifies whether empty pages (i.e. allocated pages
that only contain zeroes) should be treated specially. use_zero_pages
When set to 1, empty pages are merged with the kernel specifies whether empty pages (i.e. allocated pages that only
zero page(s) instead of with each other as it would contain zeroes) should be treated specially. When set to 1,
happen normally. This can improve the performance on empty pages are merged with the kernel zero page(s) instead of
architectures with coloured zero pages, depending on with each other as it would happen normally. This can improve
the workload. Care should be taken when enabling this the performance on architectures with coloured zero pages,
setting, as it can potentially degrade the performance depending on the workload. Care should be taken when enabling
of KSM for some workloads, for example if the checksums this setting, as it can potentially degrade the performance of
of pages candidate for merging match the checksum of KSM for some workloads, for example if the checksums of pages
an empty page. This setting can be changed at any time, candidate for merging match the checksum of an empty
it is only effective for pages merged after the change. page. This setting can be changed at any time, it is only
Default: 0 (normal KSM behaviour as in earlier releases) effective for pages merged after the change. Default: 0
(normal KSM behaviour as in earlier releases)
max_page_sharing - Maximum sharing allowed for each KSM page. This
enforces a deduplication limit to avoid the virtual max_page_sharing
memory rmap lists to grow too large. The minimum Maximum sharing allowed for each KSM page. This enforces a
value is 2 as a newly created KSM page will have at deduplication limit to avoid the virtual memory rmap lists to
least two sharers. The rmap walk has O(N) grow too large. The minimum value is 2 as a newly created KSM
complexity where N is the number of rmap_items page will have at least two sharers. The rmap walk has O(N)
(i.e. virtual mappings) that are sharing the page, complexity where N is the number of rmap_items (i.e. virtual
which is in turn capped by max_page_sharing. So mappings) that are sharing the page, which is in turn capped
this effectively spreads the linear by max_page_sharing. So this effectively spreads the linear
computational complexity from rmap walk context O(N) computational complexity from rmap walk context over
over different KSM pages. The ksmd walk over the different KSM pages. The ksmd walk over the stable_node
stable_node "chains" is also O(N), but N is the "chains" is also O(N), but N is the number of stable_node
number of stable_node "dups", not the number of "dups", not the number of rmap_items, so it has not a
rmap_items, so it has not a significant impact on significant impact on ksmd performance. In practice the best
ksmd performance. In practice the best stable_node stable_node "dup" candidate will be kept and found at the head
"dup" candidate will be kept and found at the head of the "dups" list. The higher this value the faster KSM will
of the "dups" list. The higher this value the merge the memory (because there will be fewer stable_node dups
faster KSM will merge the memory (because there queued into the stable_node chain->hlist to check for pruning)
will be fewer stable_node dups queued into the and the higher the deduplication factor will be, but the
stable_node chain->hlist to check for pruning) and slower the worst case rmap walk could be for any given KSM
the higher the deduplication factor will be, but page. Slowing down the rmap_walk means there will be higher
the slower the worst case rmap walk could be for latency for certain virtual memory operations happening during
any given KSM page. Slowing down the rmap_walk swapping, compaction, NUMA balancing and page migration, in
means there will be higher latency for certain turn decreasing responsiveness for the caller of those virtual
virtual memory operations happening during memory operations. The scheduler latency of other tasks not
swapping, compaction, NUMA balancing and page involved with the VM operations doing the rmap walk is not
migration, in turn decreasing responsiveness for affected by this parameter as the rmap walks are always
the caller of those virtual memory operations. The schedule friendly themselves.
scheduler latency of other tasks not involved with
the VM operations doing the rmap walk is not stable_node_chains_prune_millisecs
affected by this parameter as the rmap walks are How frequently to walk the whole list of stable_node "dups"
always schedule friendly themselves. linked in the stable_node "chains" in order to prune stale
stable_nodes. Smaller millisecs values will free up the KSM
stable_node_chains_prune_millisecs - How frequently to walk the whole metadata with lower latency, but they will make ksmd use more
list of stable_node "dups" linked in the CPU during the scan. This only applies to the stable_node
stable_node "chains" in order to prune stale chains so it's a noop if not a single KSM page hit the
stable_nodes. Smaller millisecs values will free max_page_sharing yet (there would be no stable_node chains in
up the KSM metadata with lower latency, but they such case).
will make ksmd use more CPU during the scan. This
only applies to the stable_node chains so it's a The effectiveness of KSM and MADV_MERGEABLE is shown in ``/sys/kernel/mm/ksm/``:
noop if not a single KSM page hit the
max_page_sharing yet (there would be no stable_node pages_shared
chains in such case). how many shared pages are being used
pages_sharing
The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/: how many more sites are sharing them i.e. how much saved
pages_unshared
pages_shared - how many shared pages are being used how many pages unique but repeatedly checked for merging
pages_sharing - how many more sites are sharing them i.e. how much saved pages_volatile
pages_unshared - how many pages unique but repeatedly checked for merging how many pages changing too fast to be placed in a tree
pages_volatile - how many pages changing too fast to be placed in a tree full_scans
full_scans - how many times all mergeable areas have been scanned how many times all mergeable areas have been scanned
stable_node_chains
stable_node_chains - number of stable node chains allocated, this is number of stable node chains allocated, this is effectively
effectively the number of KSM pages that hit the the number of KSM pages that hit the max_page_sharing limit
max_page_sharing limit stable_node_dups
stable_node_dups - number of stable node dups queued into the number of stable node dups queued into the stable_node chains
stable_node chains
A high ratio of pages_sharing to pages_shared indicates good sharing, but A high ratio of pages_sharing to pages_shared indicates good sharing, but
a high ratio of pages_unshared to pages_sharing indicates wasted effort. a high ratio of pages_unshared to pages_sharing indicates wasted effort.
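For illustration only, a minimal user-space sketch of the application side
described above: the region is made MADV_MERGEABLE and ksmd (enabled
separately, e.g. with ``echo 1 > /sys/kernel/mm/ksm/run``) then merges the
identical pages; the size and fill pattern are arbitrary examples::

  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>

  int main(void)
  {
          size_t len = 64 * 4096;
          char *buf;

          /* anonymous area the application is willing to have deduplicated */
          buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (buf == MAP_FAILED)
                  return 1;
          memset(buf, 0x5a, len);        /* identical content on every page */

          /* register the area with KSM; ksmd does the actual merging */
          if (madvise(buf, len, MADV_MERGEABLE))
                  perror("madvise(MADV_MERGEABLE)");

          return 0;
  }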
......
.. _mmu_notifier:
When do you need to notify inside page table lock ? When do you need to notify inside page table lock ?
===================================================
When clearing a pte/pmd we are given a choice to notify the event through When clearing a pte/pmd we are given a choice to notify the event through
(notify version of *_clear_flush call mmu_notifier_invalidate_range) under (notify version of \*_clear_flush call mmu_notifier_invalidate_range) under
the page table lock. But that notification is not necessary in all cases. the page table lock. But that notification is not necessary in all cases.
For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use
...@@ -18,6 +21,7 @@ a page that might now be used by some completely different task. ...@@ -18,6 +21,7 @@ a page that might now be used by some completely different task.
Case B is more subtle. For correctness it requires the following sequence to Case B is more subtle. For correctness it requires the following sequence to
happen: happen:
- take page table lock - take page table lock
- clear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify()) - clear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify())
- set page table entry to point to new page - set page table entry to point to new page
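For illustration, the pte-level variant of the sequence above could look
roughly like the following kernel sketch; the surrounding
mmu_notifier_invalidate_range_start()/end() calls and error handling are
omitted, and the variables (mm, pmd, vma, addr, new_page) are assumed to come
from the enclosing COW path::

  pte_t *ptep;
  spinlock_t *ptl;

  ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);  /* take page table lock */

  /* clear the entry and notify the secondary TLB (IOMMU / device TLB) */
  ptep_clear_flush_notify(vma, addr, ptep);

  /* only now make the entry point at the new page */
  set_pte_at(mm, addr, ptep, mk_pte(new_page, vma->vm_page_prot));

  pte_unmap_unlock(ptep, ptl);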
...@@ -28,58 +32,60 @@ the device. ...@@ -28,58 +32,60 @@ the device.
Consider the following scenario (device use a feature similar to ATS/PASID): Consider the following scenario (device use a feature similar to ATS/PASID):
Two address addrA and addrB such that |addrA - addrB| >= PAGE_SIZE we assume Two address addrA and addrB such that \|addrA - addrB\| >= PAGE_SIZE we assume
they are write protected for COW (other case of B apply too). they are write protected for COW (other case of B apply too).
[Time N] -------------------------------------------------------------------- ::
CPU-thread-0 {try to write to addrA}
CPU-thread-1 {try to write to addrB} [Time N] --------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {try to write to addrA}
CPU-thread-3 {} CPU-thread-1 {try to write to addrB}
DEV-thread-0 {read addrA and populate device TLB} CPU-thread-2 {}
DEV-thread-2 {read addrB and populate device TLB} CPU-thread-3 {}
[Time N+1] ------------------------------------------------------------------ DEV-thread-0 {read addrA and populate device TLB}
CPU-thread-0 {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}} DEV-thread-2 {read addrB and populate device TLB}
CPU-thread-1 {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}} [Time N+1] ------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}}
CPU-thread-3 {} CPU-thread-1 {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}}
DEV-thread-0 {} CPU-thread-2 {}
DEV-thread-2 {} CPU-thread-3 {}
[Time N+2] ------------------------------------------------------------------ DEV-thread-0 {}
CPU-thread-0 {COW_step1: {update page table to point to new page for addrA}} DEV-thread-2 {}
CPU-thread-1 {COW_step1: {update page table to point to new page for addrB}} [Time N+2] ------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {COW_step1: {update page table to point to new page for addrA}}
CPU-thread-3 {} CPU-thread-1 {COW_step1: {update page table to point to new page for addrB}}
DEV-thread-0 {} CPU-thread-2 {}
DEV-thread-2 {} CPU-thread-3 {}
[Time N+3] ------------------------------------------------------------------ DEV-thread-0 {}
CPU-thread-0 {preempted} DEV-thread-2 {}
CPU-thread-1 {preempted} [Time N+3] ------------------------------------------------------------------
CPU-thread-2 {write to addrA which is a write to new page} CPU-thread-0 {preempted}
CPU-thread-3 {} CPU-thread-1 {preempted}
DEV-thread-0 {} CPU-thread-2 {write to addrA which is a write to new page}
DEV-thread-2 {} CPU-thread-3 {}
[Time N+3] ------------------------------------------------------------------ DEV-thread-0 {}
CPU-thread-0 {preempted} DEV-thread-2 {}
CPU-thread-1 {preempted} [Time N+3] ------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {preempted}
CPU-thread-3 {write to addrB which is a write to new page} CPU-thread-1 {preempted}
DEV-thread-0 {} CPU-thread-2 {}
DEV-thread-2 {} CPU-thread-3 {write to addrB which is a write to new page}
[Time N+4] ------------------------------------------------------------------ DEV-thread-0 {}
CPU-thread-0 {preempted} DEV-thread-2 {}
CPU-thread-1 {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}} [Time N+4] ------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {preempted}
CPU-thread-3 {} CPU-thread-1 {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}}
DEV-thread-0 {} CPU-thread-2 {}
DEV-thread-2 {} CPU-thread-3 {}
[Time N+5] ------------------------------------------------------------------ DEV-thread-0 {}
CPU-thread-0 {preempted} DEV-thread-2 {}
CPU-thread-1 {} [Time N+5] ------------------------------------------------------------------
CPU-thread-2 {} CPU-thread-0 {preempted}
CPU-thread-3 {} CPU-thread-1 {}
DEV-thread-0 {read addrA from old page} CPU-thread-2 {}
DEV-thread-2 {read addrB from new page} CPU-thread-3 {}
DEV-thread-0 {read addrA from old page}
DEV-thread-2 {read addrB from new page}
So here because at time N+2 the clear page table entry was not paired with a So here because at time N+2 the clear page table entry was not paired with a
notification to invalidate the secondary TLB, the device sees the new value for notification to invalidate the secondary TLB, the device sees the new value for
......
.. _numa:
Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com> Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com>
=============
What is NUMA? What is NUMA?
=============
This question can be answered from a couple of perspectives: the This question can be answered from a couple of perspectives: the
hardware view and the Linux software view. hardware view and the Linux software view.
...@@ -106,7 +110,7 @@ to improve NUMA locality using various CPU affinity command line interfaces, ...@@ -106,7 +110,7 @@ to improve NUMA locality using various CPU affinity command line interfaces,
such as taskset(1) and numactl(1), and program interfaces such as such as taskset(1) and numactl(1), and program interfaces such as
sched_setaffinity(2). Further, one can modify the kernel's default local sched_setaffinity(2). Further, one can modify the kernel's default local
allocation behavior using Linux NUMA memory policy. allocation behavior using Linux NUMA memory policy.
[see Documentation/vm/numa_memory_policy.txt.] [see Documentation/vm/numa_memory_policy.rst.]
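For illustration, pinning the calling thread to a couple of CPUs with
sched_setaffinity(2) might look like the sketch below; which CPUs belong to
which node is system specific, so CPUs 0 and 1 are only an assumption here::

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>

  int main(void)
  {
          cpu_set_t set;

          CPU_ZERO(&set);
          CPU_SET(0, &set);        /* assumed to be on the desired node */
          CPU_SET(1, &set);

          /* pid 0 means the calling thread */
          if (sched_setaffinity(0, sizeof(set), &set))
                  perror("sched_setaffinity");

          /* by default, memory allocated from now on comes from the
           * node(s) local to the CPUs the task runs on */
          return 0;
  }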
System administrators can restrict the CPUs and nodes' memories that a non- System administrators can restrict the CPUs and nodes' memories that a non-
privileged user can specify in the scheduling or NUMA commands and functions privileged user can specify in the scheduling or NUMA commands and functions
......
The Linux kernel supports the following overcommit handling modes
0 - Heuristic overcommit handling. Obvious overcommits of
address space are refused. Used for a typical system. It
ensures a seriously wild allocation fails while allowing
overcommit to reduce swap usage. root is allowed to
allocate slightly more memory in this mode. This is the
default.
1 - Always overcommit. Appropriate for some scientific
applications. Classic example is code using sparse arrays
and just relying on the virtual memory consisting almost
entirely of zero pages.
2 - Don't overcommit. The total address space commit
for the system is not permitted to exceed swap + a
configurable amount (default is 50%) of physical RAM.
Depending on the amount you use, in most situations
this means a process will not be killed while accessing
pages but will receive errors on memory allocation as
appropriate.
Useful for applications that want to guarantee their
memory allocations will be available in the future
without having to initialize every page.
The overcommit policy is set via the sysctl `vm.overcommit_memory'.
The overcommit amount can be set via `vm.overcommit_ratio' (percentage)
or `vm.overcommit_kbytes' (absolute value).
The current overcommit limit and amount committed are viewable in
/proc/meminfo as CommitLimit and Committed_AS respectively.
Gotchas
-------
The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much but it's a corner case if you really really care
In mode 2 the MAP_NORESERVE flag is ignored.
How It Works
------------
The overcommit is based on the following rules
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
Additional accounting
Pages made writable copies by mmap
shmfs memory drawn from the same pool
Status
------
o We account mmap memory mappings
o We account mprotect changes in commit
o We account mremap changes in size
o We account brk
o We account munmap
o We report the commit status in /proc
o Account and check on fork
o Review stack handling/building on exec
o SHMfs accounting
o Implement actual limit enforcement
To Do
-----
o Account ptrace pages (this is hard)
.. _overcommit_accounting:
=====================
Overcommit Accounting
=====================
The Linux kernel supports the following overcommit handling modes
0
Heuristic overcommit handling. Obvious overcommits of address
space are refused. Used for a typical system. It ensures a
seriously wild allocation fails while allowing overcommit to
reduce swap usage. root is allowed to allocate slightly more
memory in this mode. This is the default.
1
Always overcommit. Appropriate for some scientific
applications. Classic example is code using sparse arrays and
just relying on the virtual memory consisting almost entirely
of zero pages.
2
Don't overcommit. The total address space commit for the
system is not permitted to exceed swap + a configurable amount
(default is 50%) of physical RAM. Depending on the amount you
use, in most situations this means a process will not be
killed while accessing pages but will receive errors on memory
allocation as appropriate.
Useful for applications that want to guarantee their memory
allocations will be available in the future without having to
initialize every page.
The overcommit policy is set via the sysctl ``vm.overcommit_memory``.
The overcommit amount can be set via ``vm.overcommit_ratio`` (percentage)
or ``vm.overcommit_kbytes`` (absolute value).
The current overcommit limit and amount committed are viewable in
``/proc/meminfo`` as CommitLimit and Committed_AS respectively.
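For illustration, in mode 2 the commit limit is roughly swap plus the
configured share of physical RAM (``vm.overcommit_ratio``, or the absolute
``vm.overcommit_kbytes`` when set), and both CommitLimit and Committed_AS can
simply be read back from ``/proc/meminfo``; a minimal sketch::

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          char line[256];
          FILE *f = fopen("/proc/meminfo", "r");

          if (!f)
                  return 1;
          while (fgets(line, sizeof(line), f)) {
                  /* current limit and amount committed, as described above */
                  if (!strncmp(line, "CommitLimit:", 12) ||
                      !strncmp(line, "Committed_AS:", 13))
                          fputs(line, stdout);
          }
          fclose(f);
          return 0;
  }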
Gotchas
=======
The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much but it's a corner case if you really really care
In mode 2 the MAP_NORESERVE flag is ignored.
How It Works
============
The overcommit is based on the following rules
For a file backed map
| SHARED or READ-only - 0 cost (the file is the map not swap)
| PRIVATE WRITABLE - size of mapping per instance
For an anonymous or ``/dev/zero`` map
| SHARED - size of mapping
| PRIVATE READ-only - 0 cost (but of little use)
| PRIVATE WRITABLE - size of mapping per instance
Additional accounting
| Pages made writable copies by mmap
| shmfs memory drawn from the same pool
Status
======
* We account mmap memory mappings
* We account mprotect changes in commit
* We account mremap changes in size
* We account brk
* We account munmap
* We report the commit status in /proc
* Account and check on fork
* Review stack handling/building on exec
* SHMfs accounting
* Implement actual limit enforcement
To Do
=====
* Account ptrace pages (this is hard)
.. _page_frags:
==============
Page fragments Page fragments
-------------- ==============
A page fragment is an arbitrary-length arbitrary-offset area of memory A page fragment is an arbitrary-length arbitrary-offset area of memory
which resides within a 0 or higher order compound page. Multiple which resides within a 0 or higher order compound page. Multiple
......
.. _page_migration:
==============
Page migration Page migration
-------------- ==============
Page migration allows the moving of the physical location of pages between Page migration allows the moving of the physical location of pages between
nodes in a numa system while the process is running. This means that the nodes in a numa system while the process is running. This means that the
...@@ -20,7 +23,7 @@ Page migration functions are provided by the numactl package by Andi Kleen ...@@ -20,7 +23,7 @@ Page migration functions are provided by the numactl package by Andi Kleen
(a version later than 0.9.3 is required. Get it from (a version later than 0.9.3 is required. Get it from
ftp://oss.sgi.com/www/projects/libnuma/download/). numactl provides libnuma ftp://oss.sgi.com/www/projects/libnuma/download/). numactl provides libnuma
which provides an interface similar to other numa functionality for page which provides an interface similar to other numa functionality for page
migration. cat /proc/<pid>/numa_maps allows an easy review of where the migration. cat ``/proc/<pid>/numa_maps`` allows an easy review of where the
pages of a process are located. See also the numa_maps documentation in the pages of a process are located. See also the numa_maps documentation in the
proc(5) man page. proc(5) man page.
...@@ -56,8 +59,8 @@ description for those trying to use migrate_pages() from the kernel ...@@ -56,8 +59,8 @@ description for those trying to use migrate_pages() from the kernel
(for userspace usage see the Andi Kleen's numactl package mentioned above) (for userspace usage see the Andi Kleen's numactl package mentioned above)
and then a low level description of how the low level details work. and then a low level description of how the low level details work.
A. In kernel use of migrate_pages() In kernel use of migrate_pages()
----------------------------------- ================================
1. Remove pages from the LRU. 1. Remove pages from the LRU.
...@@ -78,8 +81,8 @@ A. In kernel use of migrate_pages() ...@@ -78,8 +81,8 @@ A. In kernel use of migrate_pages()
the new page for each page that is considered for the new page for each page that is considered for
moving. moving.
B. How migrate_pages() works How migrate_pages() works
---------------------------- =========================
migrate_pages() does several passes over its list of pages. A page is moved migrate_pages() does several passes over its list of pages. A page is moved
if all references to a page are removable at the time. The page has if all references to a page are removable at the time. The page has
...@@ -142,8 +145,8 @@ Steps: ...@@ -142,8 +145,8 @@ Steps:
20. The new page is moved to the LRU and can be scanned by the swapper 20. The new page is moved to the LRU and can be scanned by the swapper
etc again. etc again.
C. Non-LRU page migration Non-LRU page migration
------------------------- ======================
Although original migration aimed for reducing the latency of memory access Although original migration aimed for reducing the latency of memory access
for NUMA, compaction which wants to create high-order pages is also a main customer. for NUMA, compaction which wants to create high-order pages is also a main customer.
...@@ -164,89 +167,91 @@ migration path. ...@@ -164,89 +167,91 @@ migration path.
If a driver wants to make its own pages movable, it should define three functions If a driver wants to make its own pages movable, it should define three functions
which are function pointers of struct address_space_operations. which are function pointers of struct address_space_operations.
1. bool (*isolate_page) (struct page *page, isolate_mode_t mode); 1. ``bool (*isolate_page) (struct page *page, isolate_mode_t mode);``
What VM expects on isolate_page function of driver is to return *true* What VM expects on isolate_page function of driver is to return *true*
if driver isolates page successfully. On returning true, VM marks the page if driver isolates page successfully. On returning true, VM marks the page
as PG_isolated so concurrent isolation in several CPUs skip the page as PG_isolated so concurrent isolation in several CPUs skip the page
for isolation. If a driver cannot isolate the page, it should return *false*. for isolation. If a driver cannot isolate the page, it should return *false*.
Once page is successfully isolated, VM uses page.lru fields so driver Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in those fields. shouldn't expect to preserve values in those fields.
2. int (*migratepage) (struct address_space *mapping, 2. ``int (*migratepage) (struct address_space *mapping,``
struct page *newpage, struct page *oldpage, enum migrate_mode); | ``struct page *newpage, struct page *oldpage, enum migrate_mode);``
After isolation, VM calls migratepage of driver with isolated page. After isolation, VM calls migratepage of driver with isolated page.
The function of migratepage is to move content of the old page to new page The function of migratepage is to move content of the old page to new page
and set up fields of struct page newpage. Keep in mind that you should and set up fields of struct page newpage. Keep in mind that you should
indicate to the VM the oldpage is no longer movable via __ClearPageMovable() indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
under page_lock if you migrated the oldpage successfully and returns under page_lock if you migrated the oldpage successfully and returns
MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
because VM interprets -EAGAIN as "temporary migration failure". On returning because VM interprets -EAGAIN as "temporary migration failure". On returning
any error except -EAGAIN, VM will give up the page migration without retrying any error except -EAGAIN, VM will give up the page migration without retrying
in this time. in this time.
Driver shouldn't touch the page.lru field that VM is using in these functions. Driver shouldn't touch the page.lru field that VM is using in these functions.
3. void (*putback_page)(struct page *); 3. ``void (*putback_page)(struct page *);``
If migration fails on isolated page, VM should return the isolated page If migration fails on isolated page, VM should return the isolated page
to the driver so VM calls driver's putback_page with migration failed page. to the driver so VM calls driver's putback_page with migration failed page.
In this function, driver should put the isolated page back to the own data In this function, driver should put the isolated page back to the own data
structure. structure.
4. non-lru movable page flags 4. non-lru movable page flags
There are two page flags for supporting non-lru movable page. There are two page flags for supporting non-lru movable page.
* PG_movable * PG_movable
Driver should use the below function to make page movable under page_lock. Driver should use the below function to make page movable under page_lock::
void __SetPageMovable(struct page *page, struct address_space *mapping) void __SetPageMovable(struct page *page, struct address_space *mapping)
It needs argument of address_space for registering migration family functions It needs argument of address_space for registering migration
which will be called by VM. Exactly speaking, PG_movable is not a real flag of family functions which will be called by VM. Exactly speaking,
struct page. Rather, VM reuses page->mapping's lower bits to represent it. PG_movable is not a real flag of struct page. Rather, VM
reuses page->mapping's lower bits to represent it.
::
#define PAGE_MAPPING_MOVABLE 0x2 #define PAGE_MAPPING_MOVABLE 0x2
page->mapping = page->mapping | PAGE_MAPPING_MOVABLE; page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;
so driver shouldn't access page->mapping directly. Instead, driver should so driver shouldn't access page->mapping directly. Instead, driver should
use page_mapping which masks off the low two bits of page->mapping under use page_mapping which masks off the low two bits of page->mapping under
page lock so it can get right struct address_space. page lock so it can get right struct address_space.
For testing of non-lru movable page, VM supports __PageMovable function. For testing of non-lru movable page, VM supports __PageMovable function.
However, it doesn't guarantee to identify non-lru movable page because However, it doesn't guarantee to identify non-lru movable page because
page->mapping field is unified with other variables in struct page. page->mapping field is unified with other variables in struct page.
As well, if driver releases the page after isolation by VM, page->mapping As well, if driver releases the page after isolation by VM, page->mapping
doesn't have stable value although it has PAGE_MAPPING_MOVABLE doesn't have stable value although it has PAGE_MAPPING_MOVABLE
(Look at __ClearPageMovable). But __PageMovable is cheap to catch whether (Look at __ClearPageMovable). But __PageMovable is cheap to catch whether
page is LRU or non-lru movable once the page has been isolated. Because page is LRU or non-lru movable once the page has been isolated. Because
LRU pages never can have PAGE_MAPPING_MOVABLE in page->mapping. It is also LRU pages never can have PAGE_MAPPING_MOVABLE in page->mapping. It is also
good for just peeking to test non-lru movable pages before more expensive good for just peeking to test non-lru movable pages before more expensive
checking with lock_page in pfn scanning to select victim. checking with lock_page in pfn scanning to select victim.
For guaranteeing non-lru movable page, VM provides PageMovable function. For guaranteeing non-lru movable page, VM provides PageMovable function.
Unlike __PageMovable, the PageMovable function validates page->mapping and Unlike __PageMovable, the PageMovable function validates page->mapping and
mapping->a_ops->isolate_page under lock_page. The lock_page prevents sudden mapping->a_ops->isolate_page under lock_page. The lock_page prevents sudden
destroying of page->mapping. destroying of page->mapping.
A driver using __SetPageMovable should clear the flag via __ClearPageMovable A driver using __SetPageMovable should clear the flag via __ClearPageMovable
under page_lock before releasing the page. under page_lock before releasing the page.
* PG_isolated * PG_isolated
To prevent concurrent isolation among several CPUs, VM marks isolated page To prevent concurrent isolation among several CPUs, VM marks isolated page
as PG_isolated under lock_page. So if a CPU encounters PG_isolated non-lru as PG_isolated under lock_page. So if a CPU encounters PG_isolated non-lru
movable page, it can skip it. Driver doesn't need to manipulate the flag movable page, it can skip it. Driver doesn't need to manipulate the flag
because VM will set/clear it automatically. Keep in mind that if driver because VM will set/clear it automatically. Keep in mind that if driver
sees PG_isolated page, it means the page have been isolated by VM so it sees PG_isolated page, it means the page have been isolated by VM so it
shouldn't touch page.lru field. shouldn't touch page.lru field.
PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
for own purpose. for own purpose.
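For illustration, a hypothetical driver wiring the three callbacks together
might look like the sketch below; the my_* names are made up, and the
driver's own bookkeeping, locking and content copying are only hinted at in
comments::

  #include <linux/fs.h>
  #include <linux/migrate.h>

  static bool my_isolate_page(struct page *page, isolate_mode_t mode)
  {
          /* detach the page from the driver's own lists; VM now owns page->lru */
          return true;
  }

  static int my_migratepage(struct address_space *mapping,
                            struct page *newpage, struct page *page,
                            enum migrate_mode mode)
  {
          /* copy the contents and driver state from page to newpage here */
          __ClearPageMovable(page);        /* old page is no longer movable */
          return MIGRATEPAGE_SUCCESS;      /* or -EAGAIN to be retried */
  }

  static void my_putback_page(struct page *page)
  {
          /* migration failed: put the page back on the driver's lists */
  }

  static const struct address_space_operations my_aops = {
          .isolate_page  = my_isolate_page,
          .migratepage   = my_migratepage,
          .putback_page  = my_putback_page,
  };

  /* when a page becomes movable, under page_lock:
   *         __SetPageMovable(page, mapping_using_my_aops);
   */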
Christoph Lameter, May 8, 2006. Christoph Lameter, May 8, 2006.
Minchan Kim, Mar 28, 2016. Minchan Kim, Mar 28, 2016.
.. _page_owner:
==================================================
page owner: Tracking about who allocated each page page owner: Tracking about who allocated each page
----------------------------------------------------------- ==================================================
* Introduction Introduction
============
page owner is for the tracking about who allocated each page. page owner is for the tracking about who allocated each page.
It can be used to debug memory leak or to find a memory hogger. It can be used to debug memory leak or to find a memory hogger.
...@@ -34,11 +38,13 @@ not affect to allocation performance, especially if the static keys jump ...@@ -34,11 +38,13 @@ not affect to allocation performance, especially if the static keys jump
label patching functionality is available. Following is the kernel's code label patching functionality is available. Following is the kernel's code
size change due to this facility. size change due to this facility.
- Without page owner - Without page owner::
text data bss dec hex filename text data bss dec hex filename
40662 1493 644 42799 a72f mm/page_alloc.o 40662 1493 644 42799 a72f mm/page_alloc.o
- With page owner - With page owner::
text data bss dec hex filename text data bss dec hex filename
40892 1493 644 43029 a815 mm/page_alloc.o 40892 1493 644 43029 a815 mm/page_alloc.o
1427 24 8 1459 5b3 mm/page_ext.o 1427 24 8 1459 5b3 mm/page_ext.o
...@@ -62,21 +68,23 @@ are catched and marked, although they are mostly allocated from struct ...@@ -62,21 +68,23 @@ are catched and marked, although they are mostly allocated from struct
page extension feature. Anyway, after that, no page is left in page extension feature. Anyway, after that, no page is left in
un-tracking state. un-tracking state.
* Usage Usage
=====
1) Build user-space helper::
1) Build user-space helper
cd tools/vm cd tools/vm
make page_owner_sort make page_owner_sort
2) Enable page owner 2) Enable page owner: add "page_owner=on" to boot cmdline.
Add "page_owner=on" to boot cmdline.
3) Do the job that you want to debug 3) Do the job that you want to debug
4) Analyze information from page owner 4) Analyze information from page owner::
cat /sys/kernel/debug/page_owner > page_owner_full.txt cat /sys/kernel/debug/page_owner > page_owner_full.txt
grep -v ^PFN page_owner_full.txt > page_owner.txt grep -v ^PFN page_owner_full.txt > page_owner.txt
./page_owner_sort page_owner.txt sorted_page_owner.txt ./page_owner_sort page_owner.txt sorted_page_owner.txt
See the result about who allocated each page See the result about who allocated each page
in the sorted_page_owner.txt. in the ``sorted_page_owner.txt``.
pagemap, from the userspace perspective .. _pagemap:
---------------------------------------
======================================
pagemap from the Userspace Perspective
======================================
pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow
userspace programs to examine the page tables and related information by userspace programs to examine the page tables and related information by
reading files in /proc. reading files in ``/proc``.
There are four components to pagemap: There are four components to pagemap:
* /proc/pid/pagemap. This file lets a userspace process find out which * ``/proc/pid/pagemap``. This file lets a userspace process find out which
physical frame each virtual page is mapped to. It contains one 64-bit physical frame each virtual page is mapped to. It contains one 64-bit
value for each virtual page, containing the following data (from value for each virtual page, containing the following data (from
fs/proc/task_mmu.c, above pagemap_read): fs/proc/task_mmu.c, above pagemap_read):
...@@ -15,7 +18,7 @@ There are four components to pagemap: ...@@ -15,7 +18,7 @@ There are four components to pagemap:
* Bits 0-54 page frame number (PFN) if present * Bits 0-54 page frame number (PFN) if present
* Bits 0-4 swap type if swapped * Bits 0-4 swap type if swapped
* Bits 5-54 swap offset if swapped * Bits 5-54 swap offset if swapped
* Bit 55 pte is soft-dirty (see Documentation/vm/soft-dirty.txt) * Bit 55 pte is soft-dirty (see Documentation/vm/soft-dirty.rst)
* Bit 56 page exclusively mapped (since 4.2) * Bit 56 page exclusively mapped (since 4.2)
* Bits 57-60 zero * Bits 57-60 zero
* Bit 61 page is file-page or shared-anon (since 3.5) * Bit 61 page is file-page or shared-anon (since 3.5)
...@@ -37,13 +40,13 @@ There are four components to pagemap: ...@@ -37,13 +40,13 @@ There are four components to pagemap:
determine which areas of memory are actually mapped and llseek to determine which areas of memory are actually mapped and llseek to
skip over unmapped regions. skip over unmapped regions.
* /proc/kpagecount. This file contains a 64-bit count of the number of * ``/proc/kpagecount``. This file contains a 64-bit count of the number of
times each page is mapped, indexed by PFN. times each page is mapped, indexed by PFN.
* /proc/kpageflags. This file contains a 64-bit set of flags for each * ``/proc/kpageflags``. This file contains a 64-bit set of flags for each
page, indexed by PFN. page, indexed by PFN.
The flags are (from fs/proc/page.c, above kpageflags_read): The flags are (from ``fs/proc/page.c``, above kpageflags_read):
0. LOCKED 0. LOCKED
1. ERROR 1. ERROR
...@@ -72,98 +75,108 @@ There are four components to pagemap: ...@@ -72,98 +75,108 @@ There are four components to pagemap:
24. ZERO_PAGE 24. ZERO_PAGE
25. IDLE 25. IDLE
* /proc/kpagecgroup. This file contains a 64-bit inode number of the * ``/proc/kpagecgroup``. This file contains a 64-bit inode number of the
memory cgroup each page is charged to, indexed by PFN. Only available when memory cgroup each page is charged to, indexed by PFN. Only available when
CONFIG_MEMCG is set. CONFIG_MEMCG is set.
Short descriptions to the page flags: Short descriptions to the page flags:
=====================================
0. LOCKED 0 - LOCKED
page is being locked for exclusive access, eg. by undergoing read/write IO page is being locked for exclusive access, eg. by undergoing read/write IO
7 - SLAB
7. SLAB
page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
When compound page is used, SLUB/SLQB will only set this flag on the head When compound page is used, SLUB/SLQB will only set this flag on the head
page; SLOB will not flag it at all. page; SLOB will not flag it at all.
10 - BUDDY
10. BUDDY
a free memory block managed by the buddy system allocator a free memory block managed by the buddy system allocator
The buddy system organizes free memory in blocks of various orders. The buddy system organizes free memory in blocks of various orders.
An order N block has 2^N physically contiguous pages, with the BUDDY flag An order N block has 2^N physically contiguous pages, with the BUDDY flag
set for and _only_ for the first page. set for and _only_ for the first page.
15 - COMPOUND_HEAD
15. COMPOUND_HEAD
16. COMPOUND_TAIL
A compound page with order N consists of 2^N physically contiguous pages. A compound page with order N consists of 2^N physically contiguous pages.
A compound page with order 2 takes the form of "HTTT", where H denotes its A compound page with order 2 takes the form of "HTTT", where H denotes its
head page and T denotes its tail page(s). The major consumers of compound head page and T denotes its tail page(s). The major consumers of compound
pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), the SLUB etc. pages are hugeTLB pages (Documentation/vm/hugetlbpage.rst), the SLUB etc.
memory allocators and various device drivers. However in this interface, memory allocators and various device drivers. However in this interface,
only huge/giga pages are made visible to end users. only huge/giga pages are made visible to end users.
17. HUGE 16 - COMPOUND_TAIL
A compound page tail (see description above).
17 - HUGE
this is an integral part of a HugeTLB page this is an integral part of a HugeTLB page
19 - HWPOISON
19. HWPOISON
hardware detected memory corruption on this page: don't touch the data! hardware detected memory corruption on this page: don't touch the data!
20 - NOPAGE
20. NOPAGE
no page frame exists at the requested address no page frame exists at the requested address
21 - KSM
21. KSM
identical memory pages dynamically shared between one or more processes identical memory pages dynamically shared between one or more processes
22 - THP
22. THP
contiguous pages which construct transparent hugepages contiguous pages which construct transparent hugepages
23 - BALLOON
23. BALLOON
balloon compaction page balloon compaction page
24 - ZERO_PAGE
24. ZERO_PAGE
zero page for pfn_zero or huge_zero page zero page for pfn_zero or huge_zero page
25 - IDLE
25. IDLE
page has not been accessed since it was marked idle (see page has not been accessed since it was marked idle (see
Documentation/vm/idle_page_tracking.txt). Note that this flag may be Documentation/vm/idle_page_tracking.rst). Note that this flag may be
stale in case the page was accessed via a PTE. To make sure the flag stale in case the page was accessed via a PTE. To make sure the flag
is up-to-date one has to read /sys/kernel/mm/page_idle/bitmap first. is up-to-date one has to read ``/sys/kernel/mm/page_idle/bitmap`` first.
IO related page flags
---------------------
[IO related page flags] 1 - ERROR
1. ERROR IO error occurred IO error occurred
3. UPTODATE page has up-to-date data 3 - UPTODATE
page has up-to-date data
ie. for file backed page: (in-memory data revision >= on-disk one) ie. for file backed page: (in-memory data revision >= on-disk one)
4. DIRTY page has been written to, hence contains new data 4 - DIRTY
page has been written to, hence contains new data
ie. for file backed page: (in-memory data revision > on-disk one) ie. for file backed page: (in-memory data revision > on-disk one)
8. WRITEBACK page is being synced to disk 8 - WRITEBACK
page is being synced to disk
[LRU related page flags]
5. LRU page is in one of the LRU lists LRU related page flags
6. ACTIVE page is in the active LRU list ----------------------
18. UNEVICTABLE page is in the unevictable (non-)LRU list
It is somehow pinned and not a candidate for LRU page reclaims, 5 - LRU
eg. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments page is in one of the LRU lists
2. REFERENCED page has been referenced since last LRU list enqueue/requeue 6 - ACTIVE
9. RECLAIM page will be reclaimed soon after its pageout IO completed page is in the active LRU list
11. MMAP a memory mapped page 18 - UNEVICTABLE
12. ANON a memory mapped page that is not part of a file page is in the unevictable (non-)LRU list. It is somehow pinned and
13. SWAPCACHE page is mapped to swap space, ie. has an associated swap entry not a candidate for LRU page reclaims, eg. ramfs pages,
14. SWAPBACKED page is backed by swap/RAM shmctl(SHM_LOCK) and mlock() memory segments
2 - REFERENCED
page has been referenced since last LRU list enqueue/requeue
9 - RECLAIM
page will be reclaimed soon after its pageout IO completed
11 - MMAP
a memory mapped page
12 - ANON
a memory mapped page that is not part of a file
13 - SWAPCACHE
page is mapped to swap space, ie. has an associated swap entry
14 - SWAPBACKED
page is backed by swap/RAM
The page-types tool in the tools/vm directory can be used to query the The page-types tool in the tools/vm directory can be used to query the
above flags. above flags.
Using pagemap to do something useful: Using pagemap to do something useful
====================================
The general procedure for using pagemap to find out about a process' memory The general procedure for using pagemap to find out about a process' memory
usage goes like this: usage goes like this:
1. Read /proc/pid/maps to determine which parts of the memory space are 1. Read ``/proc/pid/maps`` to determine which parts of the memory space are
mapped to what. mapped to what.
2. Select the maps you are interested in -- all of them, or a particular 2. Select the maps you are interested in -- all of them, or a particular
library, or the stack or the heap, etc. library, or the stack or the heap, etc.
3. Open /proc/pid/pagemap and seek to the pages you would like to examine. 3. Open ``/proc/pid/pagemap`` and seek to the pages you would like to examine.
4. Read a u64 for each page from pagemap. 4. Read a u64 for each page from pagemap.
5. Open /proc/kpagecount and/or /proc/kpageflags. For each PFN you just 5. Open ``/proc/kpagecount`` and/or ``/proc/kpageflags``. For each PFN you
read, seek to that entry in the file, and read the data you want. just read, seek to that entry in the file, and read the data you want.
For example, to find the "unique set size" (USS), which is the amount of For example, to find the "unique set size" (USS), which is the amount of
memory that a process is using that is not shared with any other process, memory that a process is using that is not shared with any other process,
...@@ -171,7 +184,8 @@ you can go through every map in the process, find the PFNs, look those up ...@@ -171,7 +184,8 @@ you can go through every map in the process, find the PFNs, look those up
in kpagecount, and tally up the number of pages that are only referenced in kpagecount, and tally up the number of pages that are only referenced
once. once.
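For illustration, a minimal sketch that looks up the pagemap entry for one
virtual address of the calling process and extracts the fields described
above (on recent kernels the PFN bits may read back as zero without
CAP_SYS_ADMIN)::

  #include <stdio.h>
  #include <stdint.h>
  #include <unistd.h>

  int main(void)
  {
          long pagesize = sysconf(_SC_PAGESIZE);
          int dummy = 42;                  /* any mapped address will do */
          uintptr_t vaddr = (uintptr_t)&dummy;
          uint64_t entry;
          FILE *f = fopen("/proc/self/pagemap", "rb");

          if (!f)
                  return 1;
          /* one 64-bit entry per virtual page, indexed by vaddr / pagesize */
          if (fseek(f, (vaddr / pagesize) * sizeof(entry), SEEK_SET) ||
              fread(&entry, sizeof(entry), 1, f) != 1) {
                  fclose(f);
                  return 1;
          }
          fclose(f);

          if (entry & (1ULL << 63))        /* bit 63: page present */
                  printf("PFN: 0x%llx\n",
                         (unsigned long long)(entry & ((1ULL << 55) - 1)));
          else
                  printf("page not present\n");
          return 0;
  }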
Other notes: Other notes
===========
Reading from any of the files will return -EINVAL if you are not starting Reading from any of the files will return -EINVAL if you are not starting
the read on an 8-byte boundary (e.g., if you sought an odd number of bytes the read on an 8-byte boundary (e.g., if you sought an odd number of bytes
......
.. _remap_file_pages:
==============================
remap_file_pages() system call
==============================
The remap_file_pages() system call is used to create a nonlinear mapping, The remap_file_pages() system call is used to create a nonlinear mapping,
that is, a mapping in which the pages of the file are mapped into a that is, a mapping in which the pages of the file are mapped into a
nonsequential order in memory. The advantage of using remap_file_pages() nonsequential order in memory. The advantage of using remap_file_pages()
......
.. _slub:
==========================
Short users guide for SLUB Short users guide for SLUB
-------------------------- ==========================
The basic philosophy of SLUB is very different from SLAB. SLAB The basic philosophy of SLUB is very different from SLAB. SLAB
requires rebuilding the kernel to activate debug options for all requires rebuilding the kernel to activate debug options for all
...@@ -8,18 +11,19 @@ SLUB can enable debugging only for selected slabs in order to avoid ...@@ -8,18 +11,19 @@ SLUB can enable debugging only for selected slabs in order to avoid
an impact on overall system performance which may make a bug more an impact on overall system performance which may make a bug more
difficult to find. difficult to find.
In order to switch debugging on one can add an option "slub_debug" In order to switch debugging on one can add an option ``slub_debug``
to the kernel command line. That will enable full debugging for to the kernel command line. That will enable full debugging for
all slabs. all slabs.
Typically one would then use the "slabinfo" command to get statistical Typically one would then use the ``slabinfo`` command to get statistical
data and perform operations on the slabs. By default slabinfo only lists data and perform operations on the slabs. By default ``slabinfo`` only lists
slabs that have data in them. See "slabinfo -h" for more options when slabs that have data in them. See "slabinfo -h" for more options when
running the command. slabinfo can be compiled with running the command. ``slabinfo`` can be compiled with
::
gcc -o slabinfo tools/vm/slabinfo.c gcc -o slabinfo tools/vm/slabinfo.c
Some of the modes of operation of slabinfo require that slub debugging Some of the modes of operation of ``slabinfo`` require that slub debugging
be enabled on the command line. F.e. no tracking information will be be enabled on the command line. F.e. no tracking information will be
available without debugging on and validation can only partially available without debugging on and validation can only partially
be performed if debugging was not switched on. be performed if debugging was not switched on.
...@@ -27,14 +31,17 @@ be performed if debugging was not switched on. ...@@ -27,14 +31,17 @@ be performed if debugging was not switched on.
Some more sophisticated uses of slub_debug: Some more sophisticated uses of slub_debug:
------------------------------------------- -------------------------------------------
Parameters may be given to slub_debug. If none is specified then full Parameters may be given to ``slub_debug``. If none is specified then full
debugging is enabled. Format: debugging is enabled. Format:
slub_debug=<Debug-Options> Enable options for all slabs slub_debug=<Debug-Options>
Enable options for all slabs
slub_debug=<Debug-Options>,<slab name> slub_debug=<Debug-Options>,<slab name>
Enable options only for select slabs Enable options only for select slabs
Possible debug options are
Possible debug options are::
F Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS F Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
Sorry SLAB legacy issues) Sorry SLAB legacy issues)
Z Red zoning Z Red zoning
...@@ -47,18 +54,18 @@ Possible debug options are ...@@ -47,18 +54,18 @@ Possible debug options are
- Switch all debugging off (useful if the kernel is - Switch all debugging off (useful if the kernel is
configured with CONFIG_SLUB_DEBUG_ON) configured with CONFIG_SLUB_DEBUG_ON)
F.e. in order to boot just with sanity checks and red zoning one would specify: F.e. in order to boot just with sanity checks and red zoning one would specify::
slub_debug=FZ slub_debug=FZ
Trying to find an issue in the dentry cache? Try Trying to find an issue in the dentry cache? Try::
slub_debug=,dentry slub_debug=,dentry
to only enable debugging on the dentry cache. to only enable debugging on the dentry cache.
Red zoning and tracking may realign the slab. We can just apply sanity checks Red zoning and tracking may realign the slab. We can just apply sanity checks
to the dentry cache with to the dentry cache with::
slub_debug=F,dentry slub_debug=F,dentry
...@@ -66,15 +73,15 @@ Debugging options may require the minimum possible slab order to increase as ...@@ -66,15 +73,15 @@ Debugging options may require the minimum possible slab order to increase as
a result of storing the metadata (for example, caches with PAGE_SIZE object a result of storing the metadata (for example, caches with PAGE_SIZE object
sizes). This has a higher likelihood of resulting in slab allocation errors sizes). This has a higher likelihood of resulting in slab allocation errors
in low memory situations or if there's high fragmentation of memory. To in low memory situations or if there's high fragmentation of memory. To
switch off debugging for such caches by default, use switch off debugging for such caches by default, use::
slub_debug=O slub_debug=O
In case you forgot to enable debugging on the kernel command line: It is In case you forgot to enable debugging on the kernel command line: It is
possible to enable debugging manually when the kernel is up. Look at the possible to enable debugging manually when the kernel is up. Look at the
contents of: contents of::
/sys/kernel/slab/<slab name>/ /sys/kernel/slab/<slab name>/
Look at the writable files. Writing 1 to them will enable the Look at the writable files. Writing 1 to them will enable the
corresponding debug option. All options can be set on a slab that does corresponding debug option. All options can be set on a slab that does
...@@ -86,71 +93,76 @@ Careful with tracing: It may spew out lots of information and never stop if ...@@ -86,71 +93,76 @@ Careful with tracing: It may spew out lots of information and never stop if
used on the wrong slab. used on the wrong slab.
Slab merging Slab merging
------------ ============
If no debug options are specified then SLUB may merge similar slabs together If no debug options are specified then SLUB may merge similar slabs together
in order to reduce overhead and increase cache hotness of objects. in order to reduce overhead and increase cache hotness of objects.
slabinfo -a displays which slabs were merged together. ``slabinfo -a`` displays which slabs were merged together.
Slab validation Slab validation
--------------- ===============
SLUB can validate all objects if the kernel was booted with slub_debug. In SLUB can validate all objects if the kernel was booted with slub_debug. In
order to do so you must have the slabinfo tool. Then you can do order to do so you must have the ``slabinfo`` tool. Then you can do
::
slabinfo -v slabinfo -v
which will test all objects. Output will be generated to the syslog. which will test all objects. Output will be generated to the syslog.
This also works in a more limited way if boot was without slab debug. This also works in a more limited way if boot was without slab debug.
In that case slabinfo -v simply tests all reachable objects. Usually In that case ``slabinfo -v`` simply tests all reachable objects. Usually
these are in the cpu slabs and the partial slabs. Full slabs are not these are in the cpu slabs and the partial slabs. Full slabs are not
tracked by SLUB in a non debug situation. tracked by SLUB in a non debug situation.
Getting more performance Getting more performance
------------------------ ========================
To some degree SLUB's performance is limited by the need to take the To some degree SLUB's performance is limited by the need to take the
list_lock once in a while to deal with partial slabs. That overhead is list_lock once in a while to deal with partial slabs. That overhead is
governed by the order of the allocation for each slab. The allocations governed by the order of the allocation for each slab. The allocations
can be influenced by kernel parameters: can be influenced by kernel parameters:
slub_min_objects=x (default 4) .. slub_min_objects=x (default 4)
slub_min_order=x (default 0) .. slub_min_order=x (default 0)
slub_max_order=x (default 3 (PAGE_ALLOC_COSTLY_ORDER)) .. slub_max_order=x (default 3 (PAGE_ALLOC_COSTLY_ORDER))
slub_min_objects allows specifying how many objects must at least fit ``slub_min_objects``
into one slab in order for the allocation order to be acceptable. allows specifying how many objects must at least fit into one
In general slub will be able to perform this number of allocations slab in order for the allocation order to be acceptable. In
on a slab without consulting centralized resources (list_lock) where general slub will be able to perform this number of
contention may occur. allocations on a slab without consulting centralized resources
(list_lock) where contention may occur.
slub_min_order specifies a minimum order of slabs. A similar effect like
slub_min_objects. ``slub_min_order``
specifies a minimum order of slabs. A similar effect like
slub_max_order specifies the order at which slub_min_objects should no ``slub_min_objects``.
longer be checked. This is useful to avoid SLUB trying to generate
super large order pages to fit slub_min_objects of a slab cache with ``slub_max_order``
large object sizes into one high order page. Setting command line specifies the order at which ``slub_min_objects`` should no
parameter debug_guardpage_minorder=N (N > 0), forces setting longer be checked. This is useful to avoid SLUB trying to
slub_max_order to 0, which causes the minimum possible order of slab generate super large order pages to fit ``slub_min_objects``
allocation. of a slab cache with large object sizes into one high order
page. Setting command line parameter
``debug_guardpage_minorder=N`` (N > 0), forces setting
``slub_max_order`` to 0, which causes the minimum possible order of
slab allocation.
SLUB Debug output SLUB Debug output
----------------- =================
Here is a sample of slub debug output: Here is a sample of slub debug output::
==================================================================== ====================================================================
BUG kmalloc-8: Redzone overwritten BUG kmalloc-8: Redzone overwritten
-------------------------------------------------------------------- --------------------------------------------------------------------
INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58 INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58
INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58 INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554 INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554
Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005 Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005
Redzone 0xc90f6d28: 00 cc cc cc . Redzone 0xc90f6d28: 00 cc cc cc .
Padding 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ Padding 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
...@@ -177,7 +189,7 @@ Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZ ...@@ -177,7 +189,7 @@ Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZ
[<b7f7b410>] 0xb7f7b410 [<b7f7b410>] 0xb7f7b410
======================= =======================
FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc
If SLUB encounters a corrupted object (full detection requires the kernel If SLUB encounters a corrupted object (full detection requires the kernel
to be booted with slub_debug) then the following output will be dumped to be booted with slub_debug) then the following output will be dumped
...@@ -185,38 +197,38 @@ into the syslog: ...@@ -185,38 +197,38 @@ into the syslog:
1. Description of the problem encountered 1. Description of the problem encountered
This will be a message in the system log starting with::
=============================================== ===============================================
BUG <slab cache affected>: <What went wrong> BUG <slab cache affected>: <What went wrong>
----------------------------------------------- -----------------------------------------------
INFO: <corruption start>-<corruption_end> <more info> INFO: <corruption start>-<corruption_end> <more info>
INFO: Slab <address> <slab information> INFO: Slab <address> <slab information>
INFO: Object <address> <object information> INFO: Object <address> <object information>
INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by
cpu> pid=<pid of the process> cpu> pid=<pid of the process>
INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu> INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu>
pid=<pid of the process> pid=<pid of the process>
(Object allocation / free information is only available if SLAB_STORE_USER is (Object allocation / free information is only available if SLAB_STORE_USER is
set for the slab. slub_debug sets that option) set for the slab. slub_debug sets that option)
2. The object contents if an object was involved. 2. The object contents if an object was involved.
Various types of lines can follow the BUG SLUB line: Various types of lines can follow the BUG SLUB line:
Bytes b4 <address> : <bytes> Bytes b4 <address> : <bytes>
Shows a few bytes before the object where the problem was detected. Shows a few bytes before the object where the problem was detected.
Can be useful if the corruption does not stop with the start of the Can be useful if the corruption does not stop with the start of the
object. object.
Object <address> : <bytes> Object <address> : <bytes>
The bytes of the object. If the object is inactive then the bytes The bytes of the object. If the object is inactive then the bytes
typically contain poison values. Any non-poison value shows a typically contain poison values. Any non-poison value shows a
corruption by a write after free. corruption by a write after free.
Redzone <address> : <bytes> Redzone <address> : <bytes>
The Redzone following the object. The Redzone is used to detect The Redzone following the object. The Redzone is used to detect
writes after the object. All bytes should always have the same writes after the object. All bytes should always have the same
value. If there is any deviation then it is due to a write after value. If there is any deviation then it is due to a write after
...@@ -225,7 +237,7 @@ Redzone <address> : <bytes> ...@@ -225,7 +237,7 @@ Redzone <address> : <bytes>
(Redzone information is only available if SLAB_RED_ZONE is set. (Redzone information is only available if SLAB_RED_ZONE is set.
slub_debug sets that option) slub_debug sets that option)
Padding <address> : <bytes> Padding <address> : <bytes>
Unused data to fill up the space in order to get the next object Unused data to fill up the space in order to get the next object
properly aligned. In the debug case we make sure that there are properly aligned. In the debug case we make sure that there are
at least 4 bytes of padding. This allows the detection of writes at least 4 bytes of padding. This allows the detection of writes
...@@ -233,29 +245,29 @@ Padding <address> : <bytes> ...@@ -233,29 +245,29 @@ Padding <address> : <bytes>
3. A stackdump 3. A stackdump
The stackdump describes the location where the error was detected. The cause
of the corruption may more likely be found by looking at the function that
allocated or freed the object.
4. Report on how the problem was dealt with in order to ensure the continued 4. Report on how the problem was dealt with in order to ensure the continued
operation of the system. operation of the system.
These are messages in the system log beginning with::
FIX <slab cache affected>: <corrective action taken> FIX <slab cache affected>: <corrective action taken>
In the above sample SLUB found that the Redzone of an active object has
been overwritten. Here a string of 8 characters was written into a slab that
has the length of 8 characters. However, an 8 character string needs a
terminating 0. That zero has overwritten the first byte of the Redzone field.
After reporting the details of the issue encountered the FIX SLUB message After reporting the details of the issue encountered the FIX SLUB message
tells us that SLUB has restored the Redzone to its proper value and then tells us that SLUB has restored the Redzone to its proper value and then
system operations continue. system operations continue.
Emergency operations
====================

Minimal debugging (sanity checks alone) can be enabled by booting with::
slub_debug=F slub_debug=F
...@@ -270,73 +282,80 @@ No guarantees. The kernel component still needs to be fixed. Performance ...@@ -270,73 +282,80 @@ No guarantees. The kernel component still needs to be fixed. Performance
may be optimized further by locating the slab that experiences corruption may be optimized further by locating the slab that experiences corruption
and enabling debugging only for that cache and enabling debugging only for that cache
I.e.::

  slub_debug=F,dentry

If the corruption occurs by writing after the end of the object then it
may be advisable to enable a Redzone to avoid corrupting the beginning
of other objects::

  slub_debug=FZ,dentry

Extended slabinfo mode and plotting
===================================

The ``slabinfo`` tool has a special 'extended' ('-X') mode that includes:
 - Slabcache Totals
 - Slabs sorted by size (up to -N <num> slabs, default 1)
 - Slabs sorted by loss (up to -N <num> slabs, default 1)

Additionally, in this mode ``slabinfo`` does not dynamically scale
sizes (G/M/K) and reports everything in bytes (this functionality is
also available to other slabinfo modes via the '-B' option), which makes
reporting more precise and accurate. Moreover, in some sense the `-X'
mode also simplifies the analysis of slabs' behaviour, because its
output can be plotted using the ``slabinfo-gnuplot.sh`` script. So it
pushes the analysis from looking through the numbers (tons of numbers)
to something easier -- visual analysis.

To generate plots:

a) collect slabinfo extended records, for example::

	while [ 1 ]; do slabinfo -X >> FOO_STATS; sleep 1; done

b) pass stats file(-s) to the ``slabinfo-gnuplot.sh`` script::

	slabinfo-gnuplot.sh FOO_STATS [FOO_STATS2 .. FOO_STATSN]

The ``slabinfo-gnuplot.sh`` script will pre-process the collected records
and generate 3 png files (and 3 pre-processing cache files) per STATS
file:
 - Slabcache Totals: FOO_STATS-totals.png
 - Slabs sorted by size: FOO_STATS-slabs-by-size.png
 - Slabs sorted by loss: FOO_STATS-slabs-by-loss.png

Another use case where ``slabinfo-gnuplot.sh`` can be useful is when you
need to compare slabs' behaviour "prior to" and "after" some code
modification. To help you out there, the ``slabinfo-gnuplot.sh`` script
can 'merge' the `Slabcache Totals` sections from different
measurements. To visually compare N plots:

a) Collect as many STATS1, STATS2, .. STATSN files as you need::

	while [ 1 ]; do slabinfo -X >> STATS<X>; sleep 1; done

b) Pre-process those STATS files::

	slabinfo-gnuplot.sh STATS1 STATS2 .. STATSN

c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
   generated pre-processed \*-totals::

	slabinfo-gnuplot.sh -t STATS1-totals STATS2-totals .. STATSN-totals

This will produce a single plot (png file).

Plots, expectedly, can be large so some fluctuations or small spikes
can go unnoticed. To deal with that, ``slabinfo-gnuplot.sh`` has two
options to 'zoom-in'/'zoom-out':

 a) ``-s %d,%d`` -- overwrites the default image width and height
 b) ``-r %d,%d`` -- specifies a range of samples to use (for example,
    in the ``slabinfo -X >> FOO_STATS; sleep 1;`` case, using a ``-r
    40,60`` range will plot only samples collected between the 40th and
    60th seconds).
Christoph Lameter, May 30, 2007 Christoph Lameter, May 30, 2007
Sergey Senozhatsky, October 23, 2015 Sergey Senozhatsky, October 23, 2015
.. _soft_dirty:

===============
Soft-Dirty PTEs
===============

The soft-dirty is a bit on a PTE which helps to track which pages a task
writes to. In order to do this tracking one should

1. Clear soft-dirty bits from the task's PTEs.

   This is done by writing "4" into the ``/proc/PID/clear_refs`` file of the
   task in question.

2. Wait some time.

3. Read soft-dirty bits from the PTEs.

   This is done by reading from the ``/proc/PID/pagemap``. The bit 55 of the
   64-bit qword is the soft-dirty one. If set, the respective PTE was
   written to since step 1.
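
The three steps above map directly onto a small user-space program. The
sketch below (error handling omitted, ``/proc/self`` used instead of another
task's PID, and the ``soft_dirty()`` helper name is just illustrative)
clears the bits, dirties one page and reads bit 55 back from ``pagemap``::

  /* Minimal sketch: watch one page of the current task via soft-dirty. */
  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <unistd.h>

  static char buf[4096];

  static int soft_dirty(void *addr)
  {
          uint64_t ent;
          int fd = open("/proc/self/pagemap", O_RDONLY);
          off_t off = ((uintptr_t)addr / getpagesize()) * sizeof(ent);

          pread(fd, &ent, sizeof(ent), off);
          close(fd);
          return (ent >> 55) & 1;         /* bit 55 is the soft-dirty bit */
  }

  int main(void)
  {
          int fd = open("/proc/self/clear_refs", O_WRONLY);

          buf[0] = 1;                     /* make sure the page is mapped */
          write(fd, "4", 1);              /* step 1: clear soft-dirty bits */
          close(fd);

          buf[0] = 2;                     /* dirty the page again */
          printf("soft-dirty: %d\n", soft_dirty(buf));    /* step 3 */
          return 0;
  }
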
Internally, to do this tracking, the writable bit is cleared from PTEs Internally, to do this tracking, the writable bit is cleared from PTEs
when the soft-dirty bit is cleared. So, after this, when the task tries to when the soft-dirty bit is cleared. So, after this, when the task tries to
modify a page at some virtual address the #PF occurs and the kernel sets modify a page at some virtual address the #PF occurs and the kernel sets
the soft-dirty bit on the respective PTE. the soft-dirty bit on the respective PTE.
Note, that although all the task's address space is marked as r/o after the
soft-dirty bits clear, the #PF-s that occur after that are processed fast.
This is so, since the pages are still mapped to physical memory, and thus all
the kernel does is find this fact out and put both writable and soft-dirty
bits on the PTE.
While in most cases tracking memory changes by #PF-s is more than enough While in most cases tracking memory changes by #PF-s is more than enough
there is still a scenario when we can lose soft dirty bits -- a task there is still a scenario when we can lose soft dirty bits -- a task
unmaps a previously mapped memory region and then maps a new one at exactly unmaps a previously mapped memory region and then maps a new one at exactly
the same place. When unmap is called, the kernel internally clears PTE values the same place. When unmap is called, the kernel internally clears PTE values
...@@ -36,7 +40,7 @@ including soft dirty bits. To notify user space application about such ...@@ -36,7 +40,7 @@ including soft dirty bits. To notify user space application about such
memory region renewal the kernel always marks new memory regions (and memory region renewal the kernel always marks new memory regions (and
expanded regions) as soft dirty. expanded regions) as soft dirty.
This feature is actively used by the checkpoint-restore project. You This feature is actively used by the checkpoint-restore project. You
can find more details about it on http://criu.org can find more details about it on http://criu.org
......
.. _split_page_table_lock:

=====================
Split page table lock
=====================
...@@ -11,6 +14,7 @@ access to the table. At the moment we use split lock for PTE and PMD ...@@ -11,6 +14,7 @@ access to the table. At the moment we use split lock for PTE and PMD
tables. Access to higher level tables is protected by mm->page_table_lock.
There are helpers to lock/unlock a table and other accessor functions: There are helpers to lock/unlock a table and other accessor functions:
- pte_offset_map_lock() - pte_offset_map_lock()
maps pte and takes PTE table lock, returns pointer to the taken maps pte and takes PTE table lock, returns pointer to the taken
lock; lock;
...@@ -34,12 +38,13 @@ Split page table lock for PMD tables is enabled, if it's enabled for PTE ...@@ -34,12 +38,13 @@ Split page table lock for PMD tables is enabled, if it's enabled for PTE
tables and the architecture supports it (see below). tables and the architecture supports it (see below).
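
For reference, the typical pattern for using these helpers in kernel code
looks roughly like the sketch below; the function name is illustrative and
``mm``, ``pmd`` and ``addr`` are assumed to come from the caller::

  /* Sketch only: read one PTE with the split PTE table lock held. */
  static int query_pte_example(struct mm_struct *mm, pmd_t *pmd,
                               unsigned long addr)
  {
          spinlock_t *ptl;
          pte_t *pte;
          int present;

          pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
          present = pte_present(*pte);    /* inspect while the lock is held */
          pte_unmap_unlock(pte, ptl);
          return present;
  }
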
Hugetlb and split page table lock
=================================
Hugetlb can support several page sizes. We use split lock only for PMD Hugetlb can support several page sizes. We use split lock only for PMD
level, but not for PUD. level, but not for PUD.
Hugetlb-specific helpers: Hugetlb-specific helpers:
- huge_pte_lock() - huge_pte_lock()
takes pmd split lock for PMD_SIZE page, mm->page_table_lock takes pmd split lock for PMD_SIZE page, mm->page_table_lock
otherwise; otherwise;
...@@ -47,7 +52,7 @@ Hugetlb-specific helpers: ...@@ -47,7 +52,7 @@ Hugetlb-specific helpers:
returns pointer to table lock; returns pointer to table lock;
Support of split page table lock by an architecture
===================================================
There's no need in special enabling of PTE split page table lock: There's no need in special enabling of PTE split page table lock:
everything required is done by pgtable_page_ctor() and pgtable_page_dtor(), everything required is done by pgtable_page_ctor() and pgtable_page_dtor(),
...@@ -73,7 +78,7 @@ NOTE: pgtable_page_ctor() and pgtable_pmd_page_ctor() can fail -- it must ...@@ -73,7 +78,7 @@ NOTE: pgtable_page_ctor() and pgtable_pmd_page_ctor() can fail -- it must
be handled properly. be handled properly.
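
As an illustration of the failure handling mentioned above, a simplified
PTE page allocation path (modelled loosely on typical ``pte_alloc_one()``
implementations; the function name is made up) would look like::

  /* Sketch only: allocate a PTE page and initialize its split lock. */
  static struct page *pte_page_alloc_example(void)
  {
          struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);

          if (!page)
                  return NULL;
          if (!pgtable_page_ctor(page)) {         /* lock init can fail */
                  __free_page(page);
                  return NULL;
          }
          return page;
  }
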
page->ptl
=========
page->ptl is used to access split page table lock, where 'page' is struct page->ptl is used to access split page table lock, where 'page' is struct
page of page containing the table. It shares storage with page->private page of page containing the table. It shares storage with page->private
...@@ -81,6 +86,7 @@ page of page containing the table. It shares storage with page->private ...@@ -81,6 +86,7 @@ page of page containing the table. It shares storage with page->private
To avoid increasing size of struct page and have best performance, we use a To avoid increasing size of struct page and have best performance, we use a
trick: trick:
- if spinlock_t fits into long, we use page->ptr as spinlock, so we - if spinlock_t fits into long, we use page->ptr as spinlock, so we
can avoid indirect access and save a cache line. can avoid indirect access and save a cache line.
- if size of spinlock_t is bigger than size of long, we use page->ptl as
......
.. _swap_numa:

===========================================
Automatically bind swap device to numa node
===========================================
If the system has more than one swap device and swap device has the node If the system has more than one swap device and swap device has the node
information, we can make use of this information to decide which swap information, we can make use of this information to decide which swap
...@@ -7,15 +10,16 @@ device to use in get_swap_pages() to get better performance. ...@@ -7,15 +10,16 @@ device to use in get_swap_pages() to get better performance.
How to use this feature
=======================
Each swap device has a priority, which decides the order in which it is used.
To make use of automatic binding, there is no need to manipulate priority
settings for swap devices. e.g. on a 2 node machine, assume 2 swap devices
swapA and swapB, with swapA attached to node 0 and swapB attached to node 1,
are going to be swapped on. Simply swap them on by doing::

	# swapon /dev/swapA
	# swapon /dev/swapB

Then node 0 will use the two swap devices in the order of swapA then swapB and Then node 0 will use the two swap devices in the order of swapA then swapB and
node 1 will use the two swap devices in the order of swapB then swapA. Note node 1 will use the two swap devices in the order of swapB then swapA. Note
...@@ -24,32 +28,39 @@ that the order of them being swapped on doesn't matter. ...@@ -24,32 +28,39 @@ that the order of them being swapped on doesn't matter.
A more complex example on a 4 node machine. Assume 6 swap devices are going to A more complex example on a 4 node machine. Assume 6 swap devices are going to
be swapped on: swapA and swapB are attached to node 0, swapC is attached to be swapped on: swapA and swapB are attached to node 0, swapC is attached to
node 1, swapD and swapE are attached to node 2 and swapF is attached to node3. node 1, swapD and swapE are attached to node 2 and swapF is attached to node3.
The way to swap them on is the same as above::

	# swapon /dev/swapA
	# swapon /dev/swapB
	# swapon /dev/swapC
	# swapon /dev/swapD
	# swapon /dev/swapE
	# swapon /dev/swapF

Then node 0 will use them in the order of::

	swapA/swapB -> swapC -> swapD -> swapE -> swapF

swapA and swapB will be used in a round robin mode before any other swap device.

node 1 will use them in the order of::

	swapC -> swapA -> swapB -> swapD -> swapE -> swapF

node 2 will use them in the order of::

	swapD/swapE -> swapA -> swapB -> swapC -> swapF

Similarly, swapD and swapE will be used in a round robin mode before any
other swap devices.

node 3 will use them in the order of::

	swapF -> swapA -> swapB -> swapC -> swapD -> swapE

Implementation details
======================
The current code uses a priority based list, swap_avail_list, to decide The current code uses a priority based list, swap_avail_list, to decide
which swap device to use and if multiple swap devices share the same which swap device to use and if multiple swap devices share the same
......
.. _userfaultfd:

===========
Userfaultfd
===========

Objective
=========
Userfaults allow the implementation of on-demand paging from userland Userfaults allow the implementation of on-demand paging from userland
and more generally they allow userland to take control of various and more generally they allow userland to take control of various
...@@ -9,7 +14,8 @@ memory page faults, something otherwise only the kernel code could do. ...@@ -9,7 +14,8 @@ memory page faults, something otherwise only the kernel code could do.
For example userfaults allows a proper and more optimal implementation For example userfaults allows a proper and more optimal implementation
of the PROT_NONE+SIGSEGV trick. of the PROT_NONE+SIGSEGV trick.
Design
======
Userfaults are delivered and resolved through the userfaultfd syscall. Userfaults are delivered and resolved through the userfaultfd syscall.
...@@ -41,7 +47,8 @@ different processes without them being aware about what is going on ...@@ -41,7 +47,8 @@ different processes without them being aware about what is going on
themselves on the same region the manager is already tracking, which themselves on the same region the manager is already tracking, which
is a corner case that would currently return -EBUSY). is a corner case that would currently return -EBUSY).
API
===
When first opened the userfaultfd must be enabled invoking the When first opened the userfaultfd must be enabled invoking the
UFFDIO_API ioctl specifying a uffdio_api.api value set to UFFD_API (or UFFDIO_API ioctl specifying a uffdio_api.api value set to UFFD_API (or
...@@ -101,7 +108,8 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an ...@@ -101,7 +108,8 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
half copied page since it'll keep userfaulting until the copy has half copied page since it'll keep userfaulting until the copy has
finished. finished.
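
A bare-bones user of this API performs the handshake and registration
described above roughly as follows. This is a sketch only: error checking
is omitted and a 4096-byte page size is assumed::

  /* Sketch: open a userfaultfd, enable the API, register one page. */
  #include <fcntl.h>
  #include <linux/userfaultfd.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
          struct uffdio_api api = { .api = UFFD_API };
          struct uffdio_register reg;
          void *area;

          ioctl(uffd, UFFDIO_API, &api);          /* enable the API first */

          area = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          reg.range.start = (unsigned long)area;
          reg.range.len = 4096;
          reg.mode = UFFDIO_REGISTER_MODE_MISSING;
          ioctl(uffd, UFFDIO_REGISTER, &reg);     /* then register a range */

          /* a monitor thread would now read() events from uffd and resolve
           * them with UFFDIO_COPY or UFFDIO_ZEROPAGE */
          return 0;
  }
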
QEMU/KVM
========
QEMU/KVM is using the userfaultfd syscall to implement postcopy live QEMU/KVM is using the userfaultfd syscall to implement postcopy live
migration. Postcopy live migration is one form of memory migration. Postcopy live migration is one form of memory
...@@ -163,7 +171,8 @@ sending the same page twice (in case the userfault is read by the ...@@ -163,7 +171,8 @@ sending the same page twice (in case the userfault is read by the
postcopy thread just before UFFDIO_COPY|ZEROPAGE runs in the migration postcopy thread just before UFFDIO_COPY|ZEROPAGE runs in the migration
thread). thread).
Non-cooperative userfaultfd
===========================
When the userfaultfd is monitored by an external manager, the manager When the userfaultfd is monitored by an external manager, the manager
must be able to track changes in the process virtual memory must be able to track changes in the process virtual memory
...@@ -172,27 +181,30 @@ the same read(2) protocol as for the page fault notifications. The ...@@ -172,27 +181,30 @@ the same read(2) protocol as for the page fault notifications. The
manager has to explicitly enable these events by setting appropriate manager has to explicitly enable these events by setting appropriate
bits in uffdio_api.features passed to UFFDIO_API ioctl: bits in uffdio_api.features passed to UFFDIO_API ioctl:
UFFD_FEATURE_EVENT_FORK
        enable userfaultfd hooks for fork(). When this feature is
        enabled, the userfaultfd context of the parent process is
        duplicated into the newly created process. The manager
        receives UFFD_EVENT_FORK with file descriptor of the new
        userfaultfd context in the uffd_msg.fork.

UFFD_FEATURE_EVENT_REMAP
        enable notifications about mremap() calls. When the
        non-cooperative process moves a virtual memory area to a
        different location, the manager will receive
        UFFD_EVENT_REMAP. The uffd_msg.remap will contain the old and
        new addresses of the area and its original length.

UFFD_FEATURE_EVENT_REMOVE
        enable notifications about madvise(MADV_REMOVE) and
        madvise(MADV_DONTNEED) calls. The event UFFD_EVENT_REMOVE will
        be generated upon these calls to madvise. The uffd_msg.remove
        will contain start and end addresses of the removed area.

UFFD_FEATURE_EVENT_UNMAP
        enable notifications about memory unmapping. The manager will
        get UFFD_EVENT_UNMAP with uffd_msg.remove containing start and
        end addresses of the unmapped area.
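
For example, a manager that wants all four events could request them in one
go at UFFDIO_API time. This fragment is only a sketch; feature bits that are
not echoed back in ``uffdio_api.features`` are unsupported by the running
kernel and must not be relied upon::

  struct uffdio_api api = {
          .api      = UFFD_API,
          .features = UFFD_FEATURE_EVENT_FORK  | UFFD_FEATURE_EVENT_REMAP |
                      UFFD_FEATURE_EVENT_REMOVE | UFFD_FEATURE_EVENT_UNMAP,
  };

  ioctl(uffd, UFFDIO_API, &api);
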
Although the UFFD_FEATURE_EVENT_REMOVE and UFFD_FEATURE_EVENT_UNMAP Although the UFFD_FEATURE_EVENT_REMOVE and UFFD_FEATURE_EVENT_UNMAP
are pretty similar, they quite differ in the action expected from the are pretty similar, they quite differ in the action expected from the
......
.. _z3fold:

======
z3fold
======
z3fold is a special purpose allocator for storing compressed pages. z3fold is a special purpose allocator for storing compressed pages.
It is designed to store up to three compressed pages per physical page. It is designed to store up to three compressed pages per physical page.
...@@ -7,6 +10,7 @@ It is a zbud derivative which allows for higher compression ...@@ -7,6 +10,7 @@ It is a zbud derivative which allows for higher compression
ratio keeping the simplicity and determinism of its predecessor. ratio keeping the simplicity and determinism of its predecessor.
The main differences between z3fold and zbud are: The main differences between z3fold and zbud are:
* unlike zbud, z3fold allows for up to PAGE_SIZE allocations * unlike zbud, z3fold allows for up to PAGE_SIZE allocations
* z3fold can hold up to 3 compressed pages in its page * z3fold can hold up to 3 compressed pages in its page
* z3fold doesn't export any API itself and is thus intended to be used * z3fold doesn't export any API itself and is thus intended to be used
......
.. _zsmalloc:

========
zsmalloc
========
This allocator is designed for use with zram. Thus, the allocator is This allocator is designed for use with zram. Thus, the allocator is
supposed to work well under low memory conditions. In particular, it supposed to work well under low memory conditions. In particular, it
...@@ -31,40 +34,49 @@ be mapped using zs_map_object() to get a usable pointer and subsequently ...@@ -31,40 +34,49 @@ be mapped using zs_map_object() to get a usable pointer and subsequently
unmapped using zs_unmap_object(). unmapped using zs_unmap_object().
stat
====

With CONFIG_ZSMALLOC_STAT, we could see zsmalloc internal information via
``/sys/kernel/debug/zsmalloc/<user name>``. Here is a sample of stat output::
 # cat /sys/kernel/debug/zsmalloc/zram0/classes

 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
    ...
    ...
     9   176           0            1           186        129          8                4
    10   192           1            0          2880       2872        135                3
    11   208           0            1           819        795         42                2
    12   224           0            1           219        159         12                4
    ...
    ...

class
        index
size
        object size zspage stores
almost_empty
        the number of ZS_ALMOST_EMPTY zspages(see below)
almost_full
        the number of ZS_ALMOST_FULL zspages(see below)
obj_allocated
        the number of objects allocated
obj_used
        the number of objects allocated to the user
pages_used
        the number of pages allocated for the class
pages_per_zspage
        the number of 0-order pages to make a zspage

We assign a zspage to ZS_ALMOST_EMPTY fullness group when n <= N / f, where

* n = number of allocated objects
* N = total number of objects zspage can store
* f = fullness_threshold_frac(ie, 4 at the moment)

Similarly, we assign zspage to:

* ZS_ALMOST_FULL  when n > N / f
* ZS_EMPTY        when n == 0
* ZS_FULL         when n == N
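
As a worked example with made-up numbers: with f = 4 and a class whose
zspage can store N = 32 objects, the zspage is ZS_ALMOST_EMPTY while it
holds n <= 8 objects, ZS_ALMOST_FULL once it holds more than 8, ZS_EMPTY
at n == 0 and ZS_FULL at n == 32.
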
...@@ -15621,7 +15621,7 @@ L: linux-mm@kvack.org ...@@ -15621,7 +15621,7 @@ L: linux-mm@kvack.org
S: Maintained S: Maintained
F: mm/zsmalloc.c F: mm/zsmalloc.c
F: include/linux/zsmalloc.h F: include/linux/zsmalloc.h
F: Documentation/vm/zsmalloc.txt F: Documentation/vm/zsmalloc.rst
ZSWAP COMPRESSED SWAP CACHING ZSWAP COMPRESSED SWAP CACHING
M: Seth Jennings <sjenning@redhat.com> M: Seth Jennings <sjenning@redhat.com>
......
...@@ -585,7 +585,7 @@ config ARCH_DISCONTIGMEM_ENABLE ...@@ -585,7 +585,7 @@ config ARCH_DISCONTIGMEM_ENABLE
Say Y to support efficient handling of discontiguous physical memory, Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access) for architectures which are either NUMA (Non-Uniform Memory Access)
or have huge holes in the physical address space for other reasons. or have huge holes in the physical address space for other reasons.
See <file:Documentation/vm/numa> for more. See <file:Documentation/vm/numa.rst> for more.
source "mm/Kconfig" source "mm/Kconfig"
......
...@@ -397,7 +397,7 @@ config ARCH_DISCONTIGMEM_ENABLE ...@@ -397,7 +397,7 @@ config ARCH_DISCONTIGMEM_ENABLE
Say Y to support efficient handling of discontiguous physical memory, Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access) for architectures which are either NUMA (Non-Uniform Memory Access)
or have huge holes in the physical address space for other reasons. or have huge holes in the physical address space for other reasons.
See <file:Documentation/vm/numa> for more. See <file:Documentation/vm/numa.rst> for more.
config ARCH_FLATMEM_ENABLE config ARCH_FLATMEM_ENABLE
def_bool y def_bool y
......
...@@ -2556,7 +2556,7 @@ config ARCH_DISCONTIGMEM_ENABLE ...@@ -2556,7 +2556,7 @@ config ARCH_DISCONTIGMEM_ENABLE
Say Y to support efficient handling of discontiguous physical memory, Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access) for architectures which are either NUMA (Non-Uniform Memory Access)
or have huge holes in the physical address space for other reasons. or have huge holes in the physical address space for other reasons.
See <file:Documentation/vm/numa> for more. See <file:Documentation/vm/numa.rst> for more.
config ARCH_SPARSEMEM_ENABLE config ARCH_SPARSEMEM_ENABLE
bool bool
......
...@@ -883,7 +883,7 @@ config PPC_MEM_KEYS ...@@ -883,7 +883,7 @@ config PPC_MEM_KEYS
page-based protections, but without requiring modification of the page-based protections, but without requiring modification of the
page tables when an application changes protection domains. page tables when an application changes protection domains.
For details, see Documentation/vm/protection-keys.txt For details, see Documentation/vm/protection-keys.rst
If unsure, say y. If unsure, say y.
......
...@@ -196,7 +196,7 @@ config HUGETLBFS ...@@ -196,7 +196,7 @@ config HUGETLBFS
help help
hugetlbfs is a filesystem backing for HugeTLB pages, based on hugetlbfs is a filesystem backing for HugeTLB pages, based on
ramfs. For architectures that support it, say Y here and read ramfs. For architectures that support it, say Y here and read
<file:Documentation/vm/hugetlbpage.txt> for details. <file:Documentation/vm/hugetlbpage.rst> for details.
If unsure, say N. If unsure, say N.
......