- 26 4月, 2014 5 次提交
-
-
由 Joe Perches 提交于
Use pr_fmt and remove embedded prefixes. Realign modified multi-line statements to open parenthesis. Convert embedded function name to "%s: ", __func__ Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Jianyu Zhan 提交于
As suggested by scripts/checkpatch.pl, substitude all pr_warning() with pr_warn(). No functional change. Signed-off-by: NJianyu Zhan <nasa4836@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Jianyu Zhan 提交于
6612f05b ("cgroup: unify pidlist and other file handling") has removed the only user of cgroup_pidlist_seq_operations : cgroup_pidlist_open(). This patch removes it. Signed-off-by: NJianyu Zhan <nasa4836@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Jianyu Zhan 提交于
1d5be6b2 ("cgroup: move module ref handling into rebind_subsystems()") makes parse_cgroupfs_options() no longer takes refcounts on subsystems. And unified hierachy makes parse_cgroupfs_options not need to call with cgroup_mutex held to protect the cgroup_subsys[]. So this patch removes BUG_ON() and the comment. As the comment doesn't contain useful information afterwards, the whole comment is removed. Signed-off-by: NJianyu Zhan <nasa4836@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
cgroup users often need a way to determine when a cgroup's subhierarchy becomes empty so that it can be cleaned up. cgroup currently provides release_agent for it; unfortunately, this mechanism is riddled with issues. * It delivers events by forking and execing a userland binary specified as the release_agent. This is a long deprecated method of notification delivery. It's extremely heavy, slow and cumbersome to integrate with larger infrastructure. * There is single monitoring point at the root. There's no way to delegate management of a subtree. * The event isn't recursive. It triggers when a cgroup doesn't have any tasks or child cgroups. Events for internal nodes trigger only after all children are removed. This again makes it impossible to delegate management of a subtree. * Events are filtered from the kernel side. "notify_on_release" file is used to subscribe to or suppress release event. This is unnecessarily complicated and probably done this way because event delivery itself was expensive. This patch implements interface file "cgroup.populated" which can be used to monitor whether the cgroup's subhierarchy has tasks in it or not. Its value is 0 if there is no task in the cgroup and its descendants; otherwise, 1, and kernfs_notify() notificaiton is triggers when the value changes, which can be monitored through poll and [di]notify. This is a lot ligther and simpler and trivially allows delegating management of subhierarchy - subhierarchy monitoring can block further propgation simply by putting itself or another process in the root of the subhierarchy and monitor events that it's interested in from there without interfering with monitoring higher in the tree. v2: Patch description updated as per Serge. v3: "cgroup.subtree_populated" renamed to "cgroup.populated". The subtree_ prefix was a bit confusing because "cgroup.subtree_control" uses it to denote the tree rooted at the cgroup sans the cgroup itself while the populated state includes the cgroup itself. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NSerge Hallyn <serge.hallyn@ubuntu.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Lennart Poettering <lennart@poettering.net>
-
- 23 4月, 2014 13 次提交
-
-
由 Tejun Heo 提交于
cgroup is switching away from multiple hierarchies and will use one unified default hierarchy where controllers can be dynamically enabled and disabled per subtree. The default hierarchy will serve as the unified hierarchy to which all controllers are attached and a css on the default hierarchy would need to also serve the tasks of descendant cgroups which don't have the controller enabled - ie. the tree may be collapsed from leaf towards root when viewed from specific controllers. This has been implemented through effective css in the previous patches. This patch finally implements dynamic subtree controller enable/disable on the default hierarchy via a new knob - "cgroup.subtree_control" which controls which controllers are enabled on the child cgroups. Let's assume a hierarchy like the following. root - A - B - C \ D root's "cgroup.subtree_control" determines which controllers are enabled on A. A's on B. B's on C and D. This coincides with the fact that controllers on the immediate sub-level are used to distribute the resources of the parent. In fact, it's natural to assume that resource control knobs of a child belong to its parent. Enabling a controller in "cgroup.subtree_control" declares that distribution of the respective resources of the cgroup will be controlled. Note that this means that controller enable states are shared among siblings. The default hierarchy has an extra restriction - only cgroups which don't contain any task may have controllers enabled in "cgroup.subtree_control". Combined with the other properties of the default hierarchy, this guarantees that, from the view point of controllers, tasks are only on the leaf cgroups. In other words, only leaf csses may contain tasks. This rules out situations where child cgroups compete against internal tasks of the parent, which is a competition between two different types of entities without any clear way to determine resource distribution between the two. Different controllers handle it differently and all the implemented behaviors are ambiguous, ad-hoc, cumbersome and/or just wrong. Having this structural constraints imposed from cgroup core removes the burden from controller implementations and enables showing one consistent behavior across all controllers. When a controller is enabled or disabled, css associations for the controller in the subtrees of each child should be updated. After enabling, the whole subtree of a child should point to the new css of the child. After disabling, the whole subtree of a child should point to the cgroup's css. This is implemented by first updating cgroup states such that cgroup_e_css() result points to the appropriate css and then invoking cgroup_update_dfl_csses() which migrates all tasks in the affected subtrees to the self cgroup on the default hierarchy. * When read, "cgroup.subtree_control" lists all the currently enabled controllers on the children of the cgroup. * White-space separated list of controller names prefixed with either '+' or '-' can be written to "cgroup.subtree_control". The ones prefixed with '+' are enabled on the controller and '-' disabled. * A controller can be enabled iff the parent's "cgroup.subtree_control" enables it and disabled iff no child's "cgroup.subtree_control" has it enabled. * If a cgroup has tasks, no controller can be enabled via "cgroup.subtree_control". Likewise, if "cgroup.subtree_control" has some controllers enabled, tasks can't be migrated into the cgroup. * All controllers which aren't bound on other hierarchies are automatically associated with the root cgroup of the default hierarchy. All the controllers which are bound to the default hierarchy are listed in the read-only file "cgroup.controllers" in the root directory. * "cgroup.controllers" in all non-root cgroups is read-only file whose content is equal to that of "cgroup.subtree_control" of the parent. This indicates which controllers can be used in the cgroup's "cgroup.subtree_control". This is still experimental and there are some holes, one of which is that ->can_attach() failure during cgroup_update_dfl_csses() may leave the cgroups in an undefined state. The issues will be addressed by future patches. v2: Non-root cgroups now also have "cgroup.controllers". Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Unified hierarchy implementation would require re-migrating tasks onto the same cgroup on the default hierarchy to reflect updated effective csses. Update cgroup_migrate_prepare_dst() so that it accepts NULL as the destination cgrp. When NULL is specified, the destination is considered to be the cgroup on the default hierarchy associated with each css_set. After this change, the identity check in cgroup_migrate_add_src() isn't sufficient for noop detection as the associated csses may change without any cgroup association changing. The only way to tell whether a migration is noop or not is testing whether the source and destination csets are identical. The noop check in cgroup_migrate_add_src() is removed and cset identity test is added to cgroup_migreate_prepare_dst(). If it's detected that source and destination csets are identical, the cset is removed removed from @preloaded_csets and all the migration nodes are cleared which makes cgroup_migrate() ignore the cset. Also, make the function append the destination css_sets to @preloaded_list so that destination css_sets always come after source css_sets. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Because the default root couldn't have any non-root csses attached to it, rebinding away from it was always allowed; however, the default hierarchy will soon host the unified hierarchy and have non-root csses so the rebind restrictions need to be updated accordingly. Instead of special casing rebinding from the default hierarchy and then checking whether the source hierarchy has children cgroups, which implies non-root csses for !dfl hierarchies, simply check whether the source hierarchy has non-root csses for the subsystem using css_next_child(). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
To implement the unified hierarchy behavior, we'll need to be able to determine the associated cgroup on the default hierarchy from css_set. Let's add css_set->dfl_cgrp so that it can be accessed conveniently and efficiently. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Now that effective css handling has been added and iterators updated accordingly, it's safe to allow cgroup creation in the default hierarchy. Unblock cgroup creation in the default hierarchy. As the default hierarchy will implement explicit enabling and disabling of controllers on each cgroup, suppress automatic css enabling on cgroup creation. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
After a css finishes offlining, offline_css() mistakenly performs RCU_INIT_POINTER(css->cgroup->subsys[ss->id], css) which just sets the cgroup->subsys[] pointer to the current value. The intention was to clear it after offline is complete, not reassign the same value. Update it to assign NULL instead of the current value. This makes cgroup_css() to return NULL once offline is complete. All the existing users of the function either can handle NULL return already or guarantee that the css doesn't get offlined. While this is a bugfix, as css lifetime is currently tied to the cgroup it belongs to, this bug doesn't cause any actual problems. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Currently, css_task_iter iterates tasks associated with a css by visiting each css_set associated with the owning cgroup and walking tasks of each of them. This works fine for !unified hierarchies as each cgroup has its own css for each associated subsystem on the hierarchy; however, on the planned unified hierarchy, a cgroup may not have csses associated and its tasks would be considered associated with the matching css of the nearest ancestor which has the subsystem enabled. This means that on the default unified hierarchy, just walking all tasks associated with a cgroup isn't enough to walk all tasks which are associated with the specified css. If any of its children doesn't have the matching css enabled, task iteration should also include all tasks from the subtree. We already added cgroup->e_csets[] to list all css_sets effectively associated with a given css and walk css_sets on that list instead to achieve such iteration. This patch updates css_task_iter iteration such that it walks css_sets on cgroup->e_csets[] instead of cgroup->cset_links if iteration is requested on an non-dummy css. Thanks to the previous iteration update, this change can be achieved with the addition of css_task_iter->ss and minimal updates to css_advance_task_iter() and css_task_iter_start(). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
This patch reorganizes css_task_iter so that adding effective css support is easier. * s/->cset_link/->cset_pos/ and s/->task/->task_pos/ for consistency * ->origin_css is used to determine whether the iteration reached the last css_set. Replace it with explicit ->cset_head so that css_advance_task_iter() doesn't have to know the termination condition directly. * css_task_iter_next() currently assumes that it's walking list of cgrp_cset_link and reaches into the current cset through the current link to determine the termination conditions for task walking. As this won't always be true for effective css walking, add ->tasks_head and ->mg_tasks_head and use them to control task walking so that css_task_iter_next() doesn't have to know how css_sets are being walked. This patch doesn't make any behavior changes. The iteration logic stays unchanged after the patch. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
css_next_child() walks the children of the specified css. It does this by finding the next cgroup and then returning the requested css. On the default unified hierarchy, a cgroup may not have a css associated with it even if the hierarchy has the subsystem enabled. This patch updates css_next_child() so that it skips children without the requested css associated. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
On the default unified hierarchy, a cgroup may be associated with csses of its ancestors, which means that a css of a given cgroup may be associated with css_sets of descendant cgroups. This means that we can't walk all tasks associated with a css by iterating the css_sets associated with the cgroup as there are css_sets which are pointing to the css but linked on the descendants. This patch adds per-subsystem list heads cgroup->e_csets[]. Any css_set which is pointing to a css is linked to css->cgroup->e_csets[$SUBSYS_ID] through css_set->e_cset_node[$SUBSYS_ID]. The lists are protected by css_set_rwsem and will allow us to walk all css_sets associated with a given css so that we can find out all associated tasks. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
In the planned default unified hierarchy, controllers may get dynamically attached to and detached from a cgroup and a cgroup may not have csses for all the controllers associated with the hierarchy. When a cgroup doesn't have its own css for a given controller, the css of the nearest ancestor with the controller enabled will be used, which is called the effective css. This patch introduces cgroup_e_css() and for_each_e_css() to access the effective csses and convert compare_css_sets(), find_existing_css_set() and cgroup_migrate() to use the effective csses so that they can handle cgroups with partial csses correctly. This means that for two css_sets to be considered identical, they should have both matching csses and cgroups. compare_css_sets() already compares both, not for correctness but for optimization. As this now becomes a matter of correctness, update the comments accordingly. For all !default hierarchies, cgroup_e_css() always equals cgroup_css(), so this patch doesn't change behavior. While at it, fix incorrect locking comment for for_each_css(). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
94419627 ("cgroup: move ->subsys_mask from cgroupfs_root to cgroup") moved ->subsys_mask from cgroup_root to cgroup to prepare for the unified hierarhcy; however, it turns out that carrying the subsys_mask of the children in the parent, instead of itself, is a lot more natural. This patch restores cgroup_root->subsys_mask and morphs cgroup->subsys_mask into cgroup->child_subsys_mask. * Uses of root->cgrp.subsys_mask are restored to root->subsys_mask. * Remove automatic setting and clearing of cgrp->subsys_mask and instead just inherit ->child_subsys_mask from the parent during cgroup creation. Note that this doesn't affect any current behaviors. * Undo __kill_css() separation. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgroup_apply_cftypes() skip creating or removing files if the subsystem is attached to the default hierarchy, which led to missing files in the root of the default hierarchy. Skipping made sense when the default hierarchy was dummy; however, now that the default hierarchy is full functional and planned to be used as the unified hierarchy, it shouldn't be skipped over. Reported-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
- 17 4月, 2014 1 次提交
-
-
由 Li Zefan 提交于
If we hit the retry path, we'll call parse_cgroupfs_options() again, but the string we pass to it has been modified by the previous call to this function. This bug can be observed by: # mount -t cgroup -o name=foo,cpuset xxx /mnt && umount /mnt && \ mount -t cgroup -o name=foo,cpuset xxx /mnt mount: wrong fs type, bad option, bad superblock on xxx, missing codepage or helper program, or other error ... The second mount passed "name=foo,cpuset" to the parser, and then it hit the retry path and call the parser again, but this time the string passed to the parser is "name=foo". To fix this, we avoid calling parse_cgroupfs_options() again in this case. Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 08 4月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
While converting cgroup to kernfs, 2bd59d48 ("cgroup: convert to kernfs") accidentally dropped the logic which makes newly created cgroup dirs and files owned by the current uid / gid. This broke cases where cgroup subtree management is delegated to !root as the sub manager wouldn't be able to create more than single level of hierarchy or put tasks into child cgroups it created. Among other things, this breaks user session management in systemd and one of the symptoms was 90s hang during shutdown. User session systemd running as the user creates a sub-service to initiate shutdown and tries to put kill(1) into it but fails because cgroup.procs is owned by root. This leads to 90s hang during shutdown. Implement cgroup_kn_set_ugid() which sets a kn's uid and gid to those of the caller and use it from file and dir creation paths. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 4月, 2014 1 次提交
-
-
由 Li Zefan 提交于
As mount() and kill_sb() is not a one-to-one match, If we mount the same cgroupfs in serveral mount points, and then umount all of them, kill_sb() will be called only once. Try: # mount -t cgroup -o cpuacct xxx /cgroup # mount -t cgroup -o cpuacct xxx /cgroup2 # cat /proc/cgroups | grep cpuacct cpuacct 2 1 1 # umount /cgroup # umount /cgroup2 # cat /proc/cgroups | grep cpuacct cpuacct 2 1 1 You'll see cgroupfs will never be freed. Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 29 3月, 2014 2 次提交
-
-
由 Li Zefan 提交于
Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Li Zefan 提交于
cgroup_exit() is called in fork and exit path. If it's called in the failure path during fork, PF_EXITING isn't set, and then lockdep will complain. Fix this by removing cgroup_exit() in that failure path. cgroup_fork() does nothing that needs cleanup. Reported-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 24 3月, 2014 1 次提交
-
-
由 Monam Agarwal 提交于
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: NMonam Agarwal <monamagarwal123@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 20 3月, 2014 2 次提交
-
-
由 Tejun Heo 提交于
cgroup_tree_mutex should nest above the kernfs active_ref protection; however, cgroup_create() and cgroup_rename() were grabbing cgroup_tree_mutex while under kernfs active_ref protection. This has actualy possibility to lead to deadlocks in case these operations race against cgroup_rmdir() which invokes kernfs_remove() on directory kernfs_node while holding cgroup_tree_mutex. Neither cgroup_create() or cgroup_rename() requires active_ref protection. The former already has enough synchronization through cgroup_lock_live_group() and the latter doesn't care, so this can be fixed by updating both functions to break all active_ref protections before grabbing cgroup_tree_mutex. While this patch fixes the immediate issue, it probably needs further work in the long term - kernfs directories should enable lockdep annotations and maybe the better way to handle this is marking directory nodes as not needing active_ref protection rather than breaking it in each operation. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Tejun Heo 提交于
cgroup_taskset is used to track and iterate target tasks while migrating a task or process and should guarantee that the first task iterated is the task group leader if a process is being migrated. b3dc094e ("cgroup: use css_set->mg_tasks to track target tasks during migration") replaced flex array cgroup_taskset->tc_array with css_set->mg_tasks list to remove process size limit and dynamic allocation during migration; unfortunately, it incorrectly used list operations which don't preserve order breaking the guarantee that cgroup_taskset_first() returns the leader for a process target. Fix it by using order preserving list operations. Note that as multiple src_csets may map to a single dst_cset, the iteration order may change across cgroup_task_migrate(); however, the leader is still guaranteed to be the first entry. The switch to list_splice_tail_init() at the end of cgroup_migrate() isn't strictly necessary. Let's still do it for consistency. Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 19 3月, 2014 10 次提交
-
-
由 Tejun Heo 提交于
This cftype flag makes the file only appear on the default hierarchy. This will later be used for cgroup.controllers file. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgrp_dfl_root will be used as the default unified hierarchy. This patch makes cgrp_dfl_root mountable by making the following changes. * cgroup_init_early() now initializes cgrp_dfl_root w/ CGRP_ROOT_SANE_BEHAVIOR. The default hierarchy is always sane. * parse_cgroupfs_options() and cgroup_mount() are updated such that cgrp_dfl_root is mounted if sane_behavior is specified w/o any subsystems. * rebind_subsystems() now populates the root directory of cgrp_dfl_root. Note that the function still guarantees success of rebinding subsystems to cgrp_dfl_root. If populating fails while rebinding to cgrp_dfl_root, it whines but ignores the error. * For backward compatibility, the default hierarchy shows up in /proc/$PID/cgroup only after it's explicitly mounted so that userland which doesn't make use of it doesn't see any change. * "current_css_set_cg_links" file of debug cgroup now treats the default hierarchy the same as other hierarchies. This is visible to userland. Given that it's for debug controller, this should be fine. * While at it, implement cgroup_on_dfl() which tests whether a give cgroup is on the default hierarchy or not. The above changes make cgrp_dfl_root mostly equivalent to other controllers but the actual unified hierarchy behaviors are not implemented yet. Let's plug child cgroup creation in cgrp_dfl_root from create_cgroup() for now. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cftype->write_string() just passes on the writeable buffer from kernfs and there's no reason to add const restriction on the buffer. The only thing const achieves is unnecessarily complicating parsing of the buffer. Drop const from @buffer. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
-
由 Tejun Heo 提交于
The dummy root will be repurposed to serve as the default unified hierarchy. Let's rename things in preparation. * s/cgroup_dummy_root/cgrp_dfl_root/ * s/cgroupfs_root/cgroup_root/ as we don't do fs part directly anymore * s/cgroup_root->top_cgroup/cgroup_root->cgrp/ for brevity This is pure rename. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgroupfs_root->subsys_mask represents the controllers attached to the hierarchy. This patch moves the field to cgroup. Subsystem initialization and rebinding updates the top cgroup's subsys_mask. For !root cgroups, the subsys_mask bits are set from create_css() and cleared from kill_css(), which effectively means that all cgroups will have the same subsys_mask as the top cgroup. While this doesn't make any difference now, this will help implementation of the default unified hierarchy where !root cgroups may have subsets of the top_cgroup's subsys_mask. While at it, __kill_css() is split out of kill_css(). The former doesn't care about the subsys_mask while the latter becomes noop if the controller is already killed and clears the matching bit if not before proceeding to killing the css. This will be used later by the default unified hierarchy implementation. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Currently, while rebinding, cgroup_dummy_root serves as the anchor point. In addition to the target root, rebind_subsystems() takes @added_mask and @removed_mask. The subsystems specified in the former are expected to be on the dummy root and then moved to the target root. The ones in the latter are moved from non-dummy root to dummy. Now that the dummy root is a fully functional one and we're planning to use it for the default unified hierarchy, this level of distinction between dummy and non-dummy roots is quite awkward. This patch updates rebind_subsystems() to take the target root and one subsystem mask and move the specified subsystmes to the target root which may or may not be the dummy root. IOW, unbinding now becomes moving the subsystems to the dummy root and binding to non-dummy root. This makes the dummy root mostly equivalent to other hierarchies in terms of the mechanism of moving subsystems around; however, we still retain all the semantical restrictions so that this patch doesn't introduce any visible behavior differences. Another noteworthy detail is that rebind_subsystems() guarantees that moving a subsystem to the dummy root never fails so that valid unmounting attempts always succeed. This unifies binding and unbinding of subsystems. The invocation points of ->bind() were inconsistent between the two and now moved after whole rebinding is complete. This doesn't break the current users and generally makes more sense. All rebind_subsystems() users are converted accordingly. Note that cgroup_remount() now makes two calls to rebind_subsystems() to bind and then unbind the requested subsystems. This will allow repurposing of the dummy hierarchy as the default unified hierarchy and shouldn't make any userland visible behavior difference. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
cgroup_dummy_root is used to host controllers which aren't attached to any other hierarchy. The root is minimally set up during kernfs bootstrap and didn't go through full hierarchy initialization. We're planning to use cgroup_dummy_root for the default unified hierarchy and thus want it to be fully functional. Replace the special initialization, which was collected into cgroup_init() by the previous patch, with an invocation of cgroup_setup_root(). This simplifies the init path and makes cgroup_dummy_root a full hierarchy with its own kernfs_root and all. As this puts the dummy hierarchy on the cgroup_roots list, rename for_each_active_root() to for_each_root() and update its users to skip the dummy root for now. This patch doesn't cause any userland visible behavior changes at this point. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
* Fields of init_css_set and css_set_count are now set using initializer instead of programmatically from cgroup_init_early(). * init_cgroup_root() now also takes @opts and performs the optional part of initialization too. The leftover part of cgroup_root_from_opts() is collapsed into its only caller - cgroup_mount(). * Initialization of cgroup_root_count and linking of init_css_set are moved from cgroup_init_early() to to cgroup_init(). None of the early_init users depends on init_css_set being linked. * Subsystem initializations are moved after dummy hierarchy init and init_css_set linking. These changes reorganize the bootstrap logic so that the dummy hierarchy can share the usual hierarchy init path and be made more normal. These changes don't make noticeable behavior changes. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
In cgroup_destroy_locked(), move setting of CGRP_DEAD above invocations of kill_css(). This doesn't make any visible behavior difference now but will be used to inhibit manipulating controller enable states of a dying cgroup on the unified hierarchy. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Li Zefan 提交于
If online_css() fails, we should remove cgroup files belonging to css->ss. Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 25 2月, 2014 4 次提交
-
-
由 Tejun Heo 提交于
cgroup_transfer_tasks() can currently fail in the middle due to memory allocation failure. When that happens, the function just aborts and returns error code and there's no way to tell how many actually got migrated at the point of failure and or to revert the partial migration. Update it to use cgroup_migrate{_add_src|prepare_dst|migrate|finish}() so that the function either succeeds or fails as a whole as long as ->can_attach() doesn't fail. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
For optimization, task_lock() is additionally used to protect task->cgroups. The optimization is pretty dubious as either css_set_rwsem is grabbed anyway or PF_EXITING already protects task->cgroups. It adds only overhead and confusion at this point. Let's drop task_[un]lock() and update comments accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
When a new process is forked, cgroup_fork() associates it with the css_set of its parent but doesn't link it into it. After the new process is linked to tasklist, cgroup_post_fork() does the linking. This is problematic for cgroup_transfer_tasks() as there's no way to tell whether there are tasks which are pointing to a css_set but not linked yet. It is impossible to implement an operation which transfer all tasks of a cgroup to another and the current cgroup_transfer_tasks() can easily be tricked into leaving a newly forked process behind if it gets called between cgroup_fork() and cgroup_post_fork(). Let's make association with a css_set and linking atomic by moving it to cgroup_post_fork(). cgroup_fork() sets child->cgroups to init_css_set as a placeholder and cgroup_post_fork() is updated to perform both the association with the parent's cgroup and linking there. This means that a newly created task will point to init_css_set without holding a ref to it much like what it does on the exit path. Empty cg_list is used to indicate that the task isn't holding a ref to the associated css_set. This fixes an actual bug with cgroup_transfer_tasks(); however, I'm not marking it for -stable. The whole thing is broken in multiple other ways which require invasive updates to fix and I don't think it's worthwhile to bother with backporting this particular one. Fortunately, the only user is cpuset and these bugs don't crash the machine. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
Currently, process / task migration is a single operation which may fail depending on memory pressure or the involved controllers' ->can_attach() callbacks. One problem with this approach is migration of multiple targets. It's impossible to tell whether a given target will be successfully migrated beforehand and cgroup core can't keep track of enough states to roll back after intermediate failure. This is already an issue with cgroup_transfer_tasks(). Also, we're gonna need multiple target migration for unified hierarchy. This patch splits migration into four stages - cgroup_migrate_add_src(), cgroup_migrate_prepare_dst(), cgroup_migrate() and cgroup_migrate_finish(), where cgroup_migrate_prepare_dst() performs all the operations which may fail due to allocation failure without actually migrating the target. The four separate stages mean that, disregarding ->can_attach() failures, the success or failure of multi target migration can be determined before performing any actual migration. If preparations of all targets succeed, the whole thing will succeed. If not, the whole operation can fail without any side-effect. Since the previous patch to use css_set->mg_tasks to keep track of migration targets, the only thing which may need memory allocation during migration is the target css_sets. cgroup_migrate_prepare() pins all source and target css_sets and link them up. Note that this can be performed without holding threadgroup_lock even if the target is a process. As long as cgroup_mutex is held, no new css_set can be put into play. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-