cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation

Now that cgroup v2 is almost out of the door, replace the development
documentation unified-hierarchy.txt with Documentation/cgroup.txt
which is a superset of unified-hierarchy.txt and authoritatively
describes all userland-visible aspects of cgroup.

v2: Updated to include all information from blkio-controller.txt and
    list filesystems which support cgroup writeback as suggested by
    Vivek.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Vivek Goyal <vgoyal@redhat.com>

6c292092 · Tejun Heo · 0d942766
3 changed files
--- a/Documentation/cgroup-legacy/blkio-controller.txt
+++ b/Documentation/cgroup-legacy/blkio-controller.txt
@@ -374,82 +374,3 @@ One can experience an overall throughput drop if you have created multiple
 groups and put applications in that group which are not driving enough
 IO to keep disk busy. In that case set group_idle=0, and CFQ will not idle
 on individual groups and throughput should improve.
-
-Writeback
-=========
-
-Page cache is dirtied through buffered writes and shared mmaps and
-written asynchronously to the backing filesystem by the writeback
-mechanism.  Writeback sits between the memory and IO domains and
-regulates the proportion of dirty memory by balancing dirtying and
-write IOs.
-
-On traditional cgroup hierarchies, relationships between different
-controllers cannot be established making it impossible for writeback
-to operate accounting for cgroup resource restrictions and all
-writeback IOs are attributed to the root cgroup.
-
-If both the blkio and memory controllers are used on the v2 hierarchy
-and the filesystem supports cgroup writeback, writeback operations
-correctly follow the resource restrictions imposed by both memory and
-blkio controllers.
-
-Writeback examines both system-wide and per-cgroup dirty memory status
-and enforces the more restrictive of the two.  Also, writeback control
-parameters which are absolute values - vm.dirty_bytes and
-vm.dirty_background_bytes - are distributed across cgroups according
-to their current writeback bandwidth.
-
-There's a peculiarity stemming from the discrepancy in ownership
-granularity between memory controller and writeback.  While memory
-controller tracks ownership per page, writeback operates on inode
-basis.  cgroup writeback bridges the gap by tracking ownership by
-inode but migrating ownership if too many foreign pages, pages which
-don't match the current inode ownership, have been encountered while
-writing back the inode.
-
-This is a conscious design choice as writeback operations are
-inherently tied to inodes making strictly following page ownership
-complicated and inefficient.  The only use case which suffers from
-this compromise is multiple cgroups concurrently dirtying disjoint
-regions of the same inode, which is an unlikely use case and decided
-to be unsupported.  Note that as memory controller assigns page
-ownership on the first use and doesn't update it until the page is
-released, even if cgroup writeback strictly follows page ownership,
-multiple cgroups dirtying overlapping areas wouldn't work as expected.
-In general, write-sharing an inode across multiple cgroups is not well
-supported.
-
-Filesystem support for cgroup writeback
---------------------------------------
-
-A filesystem can make writeback IOs cgroup-aware by updating
-address_space_operations->writepage[s]() to annotate bio's using the
-following two functions.
-
-* wbc_init_bio(@wbc, @bio)
-
-  Should be called for each bio carrying writeback data and associates
-  the bio with the inode's owner cgroup.  Can be called anytime
-  between bio allocation and submission.
-
-* wbc_account_io(@wbc, @page, @bytes)
-
-  Should be called for each data segment being written out.  While
-  this function doesn't care exactly when it's called during the
-  writeback session, it's the easiest and most natural to call it as
-  data segments are added to a bio.
-
-With writeback bio's annotated, cgroup support can be enabled per
-super_block by setting MS_CGROUPWB in ->s_flags.  This allows for
-selective disabling of cgroup writeback support which is helpful when
-certain filesystem features, e.g. journaled data mode, are
-incompatible.
-
-wbc_init_bio() binds the specified bio to its cgroup.  Depending on
-the configuration, the bio may be executed at a lower priority and if
-the writeback session is holding shared resources, e.g. a journal
-entry, may lead to priority inversion.  There is no one easy solution
-for the problem.  Filesystems can try to work around specific problem
-cases by skipping wbc_init_bio() or using bio_associate_blkcg()
-directly.
--- a/Documentation/cgroup-legacy/unified-hierarchy.txt
+++ b/Documentation/cgroup-legacy/unified-hierarchy.txt
--- a/Documentation/cgroup.txt
+++ b/Documentation/cgroup.txt