- 27 5月, 2015 3 次提交
-
-
由 Tejun Heo 提交于
Now that threadgroup locking is made global, code paths around it can be simplified. * lock-verify-unlock-retry dancing removed from __cgroup_procs_write(). * Race protection against de_thread() removed from cgroup_update_dfl_csses(). Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
The cgroup side of threadgroup locking uses signal_struct->group_rwsem to synchronize against threadgroup changes. This per-process rwsem adds small overhead to thread creation, exit and exec paths, forces cgroup code paths to do lock-verify-unlock-retry dance in a couple places and makes it impossible to atomically perform operations across multiple processes. This patch replaces signal_struct->group_rwsem with a global percpu_rwsem cgroup_threadgroup_rwsem which is cheaper on the reader side and contained in cgroups proper. This patch converts one-to-one. This does make writer side heavier and lower the granularity; however, cgroup process migration is a fairly cold path, we do want to optimize thread operations over it and cgroup migration operations don't take enough time for the lower granularity to matter. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org>
-
由 Tejun Heo 提交于
threadgroup_change_begin/end() are used to mark the beginning and end of threadgroup modifying operations to allow code paths which require a threadgroup to stay stable across blocking operations to synchronize against those sections using threadgroup_lock/unlock(). It's currently implemented as a general mechanism in sched.h using per-signal_struct rwsem; however, this never grew non-cgroup use cases and becomes noop if !CONFIG_CGROUPS. It turns out that cgroups is gonna be better served with a different sycnrhonization scheme and is a bit silly to keep cgroups specific details as a general mechanism. What's general here is identifying the places where threadgroups are modified. This patch restructures threadgroup locking so that threadgroup_change_begin/end() become a place where subsystems which need to sycnhronize against threadgroup changes can hook into. cgroup_threadgroup_change_begin/end() which operate on the per-signal_struct rwsem are created and threadgroup_lock/unlock() are moved to cgroup.c and made static. This is pure reorganization which doesn't cause any functional changes. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 19 5月, 2015 1 次提交
-
-
由 Aleksa Sarai 提交于
Switch the type of all internal cgroup masks to (unsigned long), which is the correct type for bitmasks. This is in preparation for the for_each_subsys_which patch. Signed-off-by: NAleksa Sarai <cyphar@cyphar.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 23 4月, 2015 1 次提交
-
-
由 Chen Hanxiao 提交于
s/effctive/effective s/hierarhcy/hierarchy s/shoulid/should Signed-off-by: NChen Hanxiao <chenhanxiao@cn.fujitsu.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 16 4月, 2015 2 次提交
-
-
由 Joe Perches 提交于
The seq_printf return value, because it's frequently misused, will eventually be converted to void. See: commit 1f33c41c ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Signed-off-by: NJoe Perches <joe@perches.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Davydov 提交于
mem_cgroup_lookup() is a wrapper around mem_cgroup_from_id(), which checks that id != 0 before issuing the function call. Today, there is no point in this additional check apart from optimization, because there is no css with id <= 0, so that css_from_id, called by mem_cgroup_from_id, will return NULL for any id <= 0. Since mem_cgroup_from_id is only called from mem_cgroup_lookup, let us zap mem_cgroup_lookup, substituting calls to it with mem_cgroup_from_id and moving the check if id > 0 to css_from_id. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 3月, 2015 2 次提交
-
-
由 Bandan Das 提交于
The wrapper already calls the appropriate free function, use it instead of spinning our own. Signed-off-by: NBandan Das <bsd@redhat.com> Acked-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Vladimir Davydov 提交于
Currently, we call cgroup_subsys->bind only on unmount, remount, and when creating a new root on mount. Since the default hierarchy root is created in cgroup_init, we will not call cgroup_subsys->bind if the default hierarchy is freshly mounted. As a result, some controllers will behave incorrectly (most notably, the "memory" controller will not enable hierarchy support). Fix this by calling cgroup_subsys->bind right after initializing a cgroup subsystem. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 14 2月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
When a new kernfs node is created, KERNFS_STATIC_NAME is used to avoid making a separate copy of its name. It's currently only used for sysfs attributes whose filenames are required to stay accessible and unchanged. There are rare exceptions where these names are allocated and formatted dynamically but for the vast majority of cases they're consts in the rodata section. Now that kernfs is converted to use kstrdup_const() and kfree_const(), there's little point in keeping KERNFS_STATIC_NAME around. Remove it. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 2月, 2015 1 次提交
-
-
由 Vladimir Davydov 提交于
Currently, we release css->id in css_release_work_fn, right before calling css_free callback, so that when css_free is called, the id may have already been reused for a new cgroup. I am going to use css->id to create unique names for per memcg kmem caches. Since kmem caches are destroyed only on css_free, I need css->id to be freed after css_free was called to avoid name clashes. This patch therefore moves css->id removal to css_free_work_fn. To prevent css_from_id from returning a pointer to a stale css, it makes css_release_work_fn replace the css ptr at css_idr:css->id with NULL. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: NTejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 1月, 2015 1 次提交
-
-
由 Johannes Weiner 提交于
Since b2052564 ("mm: memcontrol: continue cache reclaim from offlined groups"), re-mounting the memory controller after using it is very likely to hang. The cgroup core assumes that any remaining references after deleting a cgroup are temporary in nature, and synchroneously waits for them, but the above-mentioned commit has left-over page cache pin its css until it is reclaimed naturally. That being said, swap entries and charged kernel memory have been doing the same indefinite pinning forever, the bug is just more likely to trigger with left-over page cache. Reparenting kernel memory is highly impractical, which leaves changing the cgroup assumptions to reflect this: once a controller has been mounted and used, it has internal state that is independent from mount and cgroup lifetime. It can be unmounted and remounted, but it can't be reconfigured during subsequent mounts. Don't offline the controller root as long as there are any children, dead or alive. A remount will no longer wait for these old references to drain, it will simply mount the persistent controller state again. Reported-by: N"Suzuki K. Poulose" <Suzuki.Poulose@arm.com> Reported-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 18 11月, 2014 6 次提交
-
-
由 Tejun Heo 提交于
Implement cgroup_get_e_css() which finds and gets the effective css for the specified cgroup and subsystem combination. This function always returns a valid pinned css. This will be used by cgroup writeback support. While at it, add comment to cgroup_e_css() to explain why that function is different from cgroup_get_e_css() and has to test cgrp->child_subsys_mask instead of cgroup_css(cgrp, ss). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NZefan Li <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Add a new cgroup_subsys operatoin ->css_e_css_changed(). This is invoked if any of the effective csses seen from the css's cgroup may have changed. This will be used to implement cgroup writeback support. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NZefan Li <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Add a new cgroup subsys callback css_released(). This is called when the reference count of the css (cgroup_subsys_state) reaches zero before RCU scheduling free. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NZefan Li <lizefan@huawei.com>
-
由 Tejun Heo 提交于
When a subsystem is offlined, its entry on @cgrp->subsys[] is cleared asynchronously. If cgroup_subtree_control_write() is requested to enable the subsystem again before the entry is cleared, it has to wait for the previous offlining to finish and clear the @cgrp->subsys[] entry before trying to enable the subsystem again. This is currently done while verifying the input enable / disable parameters. This used to be correct but f63070d3 ("cgroup: make interface files visible iff enabled on cgroup->subtree_control") breaks it. The commit is one of the commits implementing subsystem dependency. Through subsystem dependency, some subsystems may be enabled and disabled implicitly in addition to the explicitly requested ones. The actual subsystems to be enabled and disabled are determined during @css_enable/disable calculation. The current offline wait logic skips the ones which are already implicitly enabled and then waits for subsystems in @enable; however, this misses the subsystems which may be implicitly enabled through dependency from @enable. If such implicitly subsystem hasn't yet finished offlining yet, the function ends up trying to create a css when its @cgrp->subsys[] slot is already occupied triggering BUG_ON() in init_and_link_css(). Fix it by moving the wait logic after @css_enable is calculated and waiting for all the subsystems in @css_enable. This fixes the above bug as the mask contains all subsystems which are to be enabled including the ones enabled through dependencies. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: f63070d3 ("cgroup: make interface files visible iff enabled on cgroup->subtree_control") Acked-by: NZefan Li <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Make cgroup_subtree_control_write() first calculate new subtree_control (new_sc), child_subsys_mask (new_ss) and css_enable/disable masks before applying them to the cgroup. Also, store the original subtree_control (old_sc) and child_subsys_mask (old_ss) and use them to restore the orignal state after failure. This patch shouldn't cause any behavior changes. This prepares for a fix for a bug in the async css offline wait logic. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NZefan Li <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgroup_refresh_child_subsys_mask() calculates and updates the effective @cgrp->child_subsys_maks according to the current @cgrp->subtree_control. Separate out the calculation part into cgroup_calc_child_subsys_mask(). This will be used to fix a bug in the async css offline wait logic. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NZefan Li <lizefan@huawei.com>
-
- 26 9月, 2014 1 次提交
-
-
由 Zefan Li 提交于
This reverts commit 0c7bf3e8. If there are child cgroups in the cgroupfs and then we umount it, the superblock will be destroyed but the cgroup_root will be kept around. When we mount it again, cgroup_mount() will find this cgroup_root and allocate a new sb for it. So with this commit we will be trapped in a dead loop in the case described above, because kernfs_pin_sb() keeps returning NULL. Currently I don't see how we can avoid using both pinned_sb and new_sb, so just revert it. Cc: Al Viro <viro@ZenIV.linux.org.uk> Reported-by: NAndrey Wagin <avagin@gmail.com> Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 25 9月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
With the recent addition of percpu_ref_reinit(), percpu_ref now can be used as a persistent switch which can be turned on and off repeatedly where turning off maps to killing the ref and waiting for it to drain; however, there currently isn't a way to initialize a percpu_ref in its off (killed and drained) state, which can be inconvenient for certain persistent switch use cases. Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic selection of operation mode; however, currently a newly initialized percpu_ref is always in percpu mode making it impossible to avoid the latency overhead of switching to atomic mode. This patch adds @flags to percpu_ref_init() and implements the following flags. * PERCPU_REF_INIT_ATOMIC : start ref in atomic mode * PERCPU_REF_INIT_DEAD : start ref killed and drained These flags should be able to serve the above two use cases. v2: target_core_tpg.c conversion was missing. Fixed. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org>
-
- 21 9月, 2014 2 次提交
-
-
由 Zefan Li 提交于
Both pinned_sb and new_sb indicate if a new superblock is needed, so we can just remove new_sb. Note now we must check if kernfs_tryget_sb() returns NULL, because when it returns NULL, kernfs_mount() may still re-use an existing superblock, which is just allocated by another concurent mount. Suggested-by: NTejun Heo <tj@kernel.org> Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Zefan Li 提交于
The patch 971ff493: "cgroup: use a per-cgroup work for release agent" from Sep 18, 2014, leads to the following static checker warning: kernel/cgroup.c:5310 cgroup_release_agent() warn: 'mutex:&cgroup_mutex' is sometimes locked here and sometimes unlocked. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 19 9月, 2014 4 次提交
-
-
由 Zefan Li 提交于
We call put_css_set() after setting CGRP_RELEASABLE flag in cgroup_task_migrate(), but in other places we call it without setting the flag. I don't see the necessity of this flag. Moreover once the flag is set, it will never be cleared, unless writing to the notify_on_release control file, so it can be quite confusing if we look at the output of debug.releasable. # mount -t cgroup -o debug xxx /cgroup # mkdir /cgroup/child # cat /cgroup/child/debug.releasable 0 <-- shows 0 though the cgroup is empty # echo $$ > /cgroup/child/tasks # cat /cgroup/child/debug.releasable 0 # echo $$ > /cgroup/tasks && echo $$ > /cgroup/child/tasks # cat /proc/child/debug.releasable 1 <-- shows 1 though the cgroup is not empty This patch removes the flag, and now debug.releasable shows if the cgroup is empty or not. Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Zefan Li 提交于
Use the ONE macro instead of REG, and we can simplify proc_cgroup_show(). Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Zefan Li 提交于
Instead of using a global work to schedule release agent on removable cgroups, we change to use a per-cgroup work to do this, which makes the code much simpler. v2: use a dedicated work instead of reusing css->destroy_work. (Tejun) Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Zefan Li 提交于
cgroup_pidlist_start() holds cgrp->pidlist_mutex and then calls pidlist_array_load(), and cgroup_pidlist_stop() releases the mutex. It is wrong that we release the mutex in the failure path in pidlist_array_load(), because cgroup_pidlist_stop() will be called no matter if cgroup_pidlist_start() returns errno or not. Fixes: 4bac00d1 Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
-
- 18 9月, 2014 3 次提交
-
-
由 Li Zefan 提交于
We never grab cgroup mutex in fork and exit paths no matter whether notify_on_release is set or not. Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Li Zefan 提交于
We no longer clear kn->priv in cgroup_rmdir(), so we don't need to get an extra refcnt. Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Li Zefan 提交于
Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 08 9月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
Percpu allocator now supports allocation mask. Add @gfp to percpu_ref_init() so that !GFP_KERNEL allocation masks can be used with percpu_refs too. This patch doesn't make any functional difference. v2: blk-mq conversion was missing. Updated. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Kent Overstreet <koverstreet@google.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Jens Axboe <axboe@kernel.dk>
-
- 05 9月, 2014 2 次提交
-
-
由 Li Zefan 提交于
When cgroup_kn_lock_live() is called through some kernfs operation and another thread is calling cgroup_rmdir(), we'll trigger the warning in cgroup_get(). ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1228 at kernel/cgroup.c:1034 cgroup_get+0x89/0xa0() ... Call Trace: [<c16ee73d>] dump_stack+0x41/0x52 [<c10468ef>] warn_slowpath_common+0x7f/0xa0 [<c104692d>] warn_slowpath_null+0x1d/0x20 [<c10bb999>] cgroup_get+0x89/0xa0 [<c10bbe58>] cgroup_kn_lock_live+0x28/0x70 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230 [<c10be5b2>] cgroup_tasks_write+0x12/0x20 [<c10bb7b0>] cgroup_file_write+0x40/0x130 [<c11aee71>] kernfs_fop_write+0xd1/0x160 [<c1148e58>] vfs_write+0x98/0x1e0 [<c114934d>] SyS_write+0x4d/0xa0 [<c16f656b>] sysenter_do_call+0x12/0x12 ---[ end trace 6f2e0c38c2108a74 ]--- Fix this by calling css_tryget() instead of cgroup_get(). v2: - move cgroup_tryget() right below cgroup_get() definition. (Tejun) Cc: <stable@vger.kernel.org> # 3.15+ Reported-by: NToralf Förster <toralf.foerster@gmx.de> Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Li Zefan 提交于
Run these two scripts concurrently: for ((; ;)) { mkdir /cgroup/sub rmdir /cgroup/sub } for ((; ;)) { echo $$ > /cgroup/sub/cgroup.procs echo $$ > /cgroup/cgroup.procs } A kernel bug will be triggered: BUG: unable to handle kernel NULL pointer dereference at 00000038 IP: [<c10bbd69>] cgroup_put+0x9/0x80 ... Call Trace: [<c10bbe19>] cgroup_kn_unlock+0x39/0x50 [<c10bbe91>] cgroup_kn_lock_live+0x61/0x70 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230 [<c10be5b2>] cgroup_tasks_write+0x12/0x20 [<c10bb7b0>] cgroup_file_write+0x40/0x130 [<c11aee71>] kernfs_fop_write+0xd1/0x160 [<c1148e58>] vfs_write+0x98/0x1e0 [<c114934d>] SyS_write+0x4d/0xa0 [<c16f656b>] sysenter_do_call+0x12/0x12 We clear cgrp->kn->priv in the end of cgroup_rmdir(), but another concurrent thread can access kn->priv after the clearing. We should move the clearing to css_release_work_fn(). At that time no one is holding reference to the cgroup and no one can gain a new reference to access it. v2: - move RCU_INIT_POINTER() into the else block. (Tejun) - remove the cgroup_parent() check. (Tejun) - update the comment in css_tryget_online_from_dir(). Cc: <stable@vger.kernel.org> # 3.15+ Reported-by: NToralf Förster <toralf.foerster@gmx.de> Signed-off-by: NZefan Li <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 25 8月, 2014 1 次提交
-
-
由 Dongsheng Yang 提交于
There is no function named cgroup_enable_task_cg_links(). Instead, the correct function name in this comment should be cgroup_enabled_task_cg_lists(). Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 23 8月, 2014 1 次提交
-
-
由 Vivek Goyal 提交于
Kernel command line parameter cgroup__DEVEL__legacy_files_on_dfl forces legacy cgroup files to show up on default hierarhcy if susbsystem does not have any files defined for default hierarchy. But this seems to be working only if legacy files are defined in ss->legacy_cftypes. If one adds some cftypes later using cgroup_add_legacy_cftypes(), these files don't show up on default hierarchy. Update the function accordingly so that the dynamically added legacy files also show up in the default hierarchy if the target subsystem is also using the base legacy files for the default hierarchy. tj: Patch description and comment updates. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 18 8月, 2014 1 次提交
-
-
由 Alban Crequy 提交于
/proc/<pid>/cgroup contains one cgroup path on each line. If cgroup names are allowed to contain "\n", applications cannot parse /proc/<pid>/cgroup safely. Signed-off-by: NAlban Crequy <alban.crequy@collabora.co.uk> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
-
- 15 7月, 2014 5 次提交
-
-
由 Tejun Heo 提交于
cgrp_dfl_root_inhibit_ss_mask determines which subsystems are not supported on the default hierarchy and is currently initialized statically and just includes the debug subsystem. Now that there's cgroup_subsys->dfl_files, we can easily tell which subsystems support the default hierarchy or not. Let's initialize cgrp_dfl_root_inhibit_ss_mask by testing whether cgroup_subsys->dfl_files is NULL. After all, subsystems with NULL ->dfl_files aren't useable on the default hierarchy anyway. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgroup now distinguishes cftypes for the default and legacy hierarchies more explicitly by using separate arrays and CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE should be and are used only inside cgroup core proper. Let's make it clear that the flags are internal by prefixing them with double underscores. CFTYPE_INSANE is renamed to __CFTYPE_NOT_ON_DFL for consistency. The two flags are also collected and assigned bits >= 16 so that they aren't mixed with the published flags. v2: Convert the extra ones in cgroup_exit_cftypes() which are added by revision to the previous patch. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Until now, cftype arrays carried files for both the default and legacy hierarchies and the files which needed to be used on only one of them were flagged with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE. This gets confusing very quickly and we may end up exposing interface files to the default hierarchy without thinking it through. This patch makes cgroup core provide separate sets of interfaces for cftype handling so that the cftypes for the default and legacy hierarchies are clearly distinguished. The previous two patches renamed the existing ones so that they clearly indicate that they're for the legacy hierarchies. This patch adds the interface for the default hierarchy and apply them selectively depending on the hierarchy type. * cftypes added through cgroup_subsys->dfl_cftypes and cgroup_add_dfl_cftypes() only show up on the default hierarchy. * cftypes added through cgroup_subsys->legacy_cftypes and cgroup_add_legacy_cftypes() only show up on the legacy hierarchies. * cgroup_subsys->dfl_cftypes and ->legacy_cftypes can point to the same array for the cases where the interface files are identical on both types of hierarchies. * This makes all the existing subsystem interface files legacy-only by default and all subsystems will have no interface file created when enabled on the default hierarchy. Each subsystem should explicitly review and compose the interface for the default hierarchy. * A boot param "cgroup__DEVEL__legacy_files_on_dfl" is added which makes subsystems which haven't decided the interface files for the default hierarchy to present the legacy files on the default hierarchy so that its behavior on the default hierarchy can be tested. As the awkward name suggests, this is for development only. * memcg's CFTYPE_INSANE on "use_hierarchy" is noop now as the whole array isn't used on the default hierarchy. The flag is removed. v2: Updated documentation for cgroup__DEVEL__legacy_files_on_dfl. v3: Clear CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE when cfts are removed as suggested by Li. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Tejun Heo 提交于
Currently, cftypes added by cgroup_add_cftypes() are used for both the unified default hierarchy and legacy ones and subsystems can mark each file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear only on one of them. This is quite hairy and error-prone. Also, we may end up exposing interface files to the default hierarchy without thinking it through. cgroup_subsys will grow two separate cftype addition functions and apply each only on the hierarchies of the matching type. This will allow organizing cftypes in a lot clearer way and encourage subsystems to scrutinize the interface which is being exposed in the new default hierarchy. In preparation, this patch adds cgroup_add_legacy_cftypes() which currently is a simple wrapper around cgroup_add_cftypes() and replaces all cgroup_add_cftypes() usages with it. While at it, this patch drops a completely spurious return from __hugetlb_cgroup_file_init(). This patch doesn't introduce any functional differences. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Tejun Heo 提交于
Currently, cgroup_subsys->base_cftypes is used for both the unified default hierarchy and legacy ones and subsystems can mark each file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear only on one of them. This is quite hairy and error-prone. Also, we may end up exposing interface files to the default hierarchy without thinking it through. cgroup_subsys will grow two separate cftype arrays and apply each only on the hierarchies of the matching type. This will allow organizing cftypes in a lot clearer way and encourage subsystems to scrutinize the interface which is being exposed in the new default hierarchy. In preparation, this patch renames cgroup_subsys->base_cftypes to cgroup_subsys->legacy_cftypes. This patch is pure rename. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-