- 22 Mar, 2012 1 commit
Committed by Hugh Dickins
Commit c1e2ee2d ("memcg: replace ss->id_lock with a rwlock") has now been seen to cause the unfair behavior we should have expected from converting a spinlock to an rwlock: softlockup in cgroup_mkdir(), whose get_new_cssid() is waiting for the wlock, while there are 19 tasks using the rlock in css_get_next() to get on with their memcg workload (in an artificial test, admittedly).

Yet lib/idr.c was made suitable for RCU way back: revert that commit, restoring ss->id_lock to a spinlock.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 22 Feb, 2012 2 commits
Committed by Frederic Weisbecker
Walking through the tasklist in cgroup_enable_task_cg_list() inside an RCU read side critical section is not enough because:

- RCU is not (yet) safe against while_each_thread()

- If we use only RCU, a forking task that has passed cgroup_post_fork() without seeing use_task_css_set_links == 1 is not guaranteed to have its child immediately visible in the tasklist if we walk through it remotely with RCU. In this case it will be missing in its css_set's task list.

Thus we need to traverse the list (unfortunately) under the tasklist_lock. It makes us safe against while_each_thread() and also makes sure we see all forked tasks that have been added to the tasklist.

As a secondary effect, reading and writing use_task_css_set_links are now well ordered against tasklist traversing and modification. The new layout is:

    CPU 0                                   CPU 1

    use_task_css_set_links = 1              write_lock(tasklist_lock)
    read_lock(tasklist_lock)                add task to tasklist
    do_each_thread() {                      write_unlock(tasklist_lock)
        add thread to css set links         if (use_task_css_set_links)
    } while_each_thread()                       add thread to css set links
    read_unlock(tasklist_lock)

If CPU 0 traverses the list after the task has been added to the tasklist then it is correctly added to the css set links. OTOH if CPU 0 traverses the tasklist before the new task had the opportunity to be added to the tasklist because it was too early in the fork process, then CPU 1 catches up and adds the task to the css set links after it added the task to the tasklist. The right value of use_task_css_set_links is guaranteed to be visible from CPU 1 due to the LOCK/UNLOCK implicit barrier properties: the read_unlock on CPU 0 makes the write to use_task_css_set_links visible, and the write_lock on CPU 1 makes the read of use_task_css_set_links that comes afterward return the correct value.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Committed by Frederic Weisbecker
Remove the stale comment about RCU protection. Many callers (all of them?) of cgroup_enable_task_cg_list() don't seem to be in an RCU read side critical section. Besides, RCU is not helpful to protect against while_each_thread().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 03 Feb, 2012 1 commit
Committed by Li Zefan
The argument is not used at all, and it's not necessary, because a specific callback handler of course knows which subsys it belongs to.

Now only ->populate() takes this argument, because the handlers of this callback always call cgroup_add_file()/cgroup_add_files().

So we reduce a few lines of code, though the shrinking of object size is minimal.

 16 files changed, 113 insertions(+), 162 deletions(-)

       text    data     bss      dec     hex filename
    5486240  656987 7039960 13183187  c928d3 vmlinux.o.orig
    5486170  656987 7039960 13183117  c9288d vmlinux.o

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 31 Jan, 2012 1 commit
Committed by Mandeep Singh Baines
In cgroup_attach_proc, we indirectly call find_existing_css_set 3 times. It is an expensive call so we want to call it as few times as possible. This patch only calls it once and stores the result so that it can be used later on when we call cgroup_task_migrate.

This required modifying cgroup_task_migrate to take the new css_set (which we obtained from find_css_set) as a parameter. The nice side effect of this is that cgroup_task_migrate is now identical for cgroup_attach_task and cgroup_attach_proc. It also now returns void since it can never fail.

Changes in V5:
* https://lkml.org/lkml/2012/1/20/344 (Tejun Heo)
  * Remove css_set_refs
Changes in V4:
* https://lkml.org/lkml/2011/12/22/421 (Li Zefan)
  * Avoid GFP_KERNEL (sleep) in rcu_read_lock by getting css_set in a separate loop not under an rcu_read_lock
Changes in V3:
* https://lkml.org/lkml/2011/12/22/13 (Li Zefan)
  * Fixed earlier bug by creating a separate patch to remove tasklist_lock
Changes in V2:
* https://lkml.org/lkml/2011/12/20/372 (Tejun Heo)
  * Move find_css_set call into loop which creates the flex array
* Author
  * Kill css_set_refs and use group_size instead
  * Fix an off-by-one error in counting css_set refs
  * Add a retval check in out_list_teardown

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
- 21 Jan, 2012 3 commits
Committed by Mandeep Singh Baines
We can replace the tasklist_lock in cgroup_attach_proc with an rcu_read_lock().

Changes in V4:
* https://lkml.org/lkml/2011/12/23/284 (Frederic Weisbecker)
  * Minimize size of rcu_read_lock critical section
  * Add comment
* https://lkml.org/lkml/2011/12/26/136 (Li Zefan)
  * Split into two patches
Changes in V3:
* https://lkml.org/lkml/2011/12/22/419 (Frederic Weisbecker)
  * Add an rcu_read_lock to protect against exit
Changes in V2:
* https://lkml.org/lkml/2011/12/22/86 (Tejun Heo)
  * Use a goto instead of returning -EAGAIN

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Mandeep Singh Baines
To keep the complexity of the double-check locking in one place, move the thread_group_leader check up into attach_task_by_pid(). This allows us to use a goto instead of returning -EAGAIN.

While at it, convert a couple of returns to gotos and use rcu for the !pid case also in order to simplify the logic.

Changes in V2:
* https://lkml.org/lkml/2011/12/22/86 (Tejun Heo)
  * Use a goto instead of returning -EAGAIN

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Li Zefan
It's only used internally.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 07 Jan, 2012 1 commit
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 06 Jan, 2012 1 commit
Committed by Li Zefan
If we mount a hierarchy with a specified name, the name is unique, and we can use it to mount the hierarchy without specifying its set of subsystem names. This feature is documented in Documentation/cgroups/cgroups.txt section 2.3.

Here's an example:

    # mount -t cgroup -o cpuset,name=myhier xxx /cgroup1
    # mount -t cgroup -o name=myhier xxx /cgroup2

But it was broken by commit 32a8cf23 (cgroup: make the mount options parsing more accurate).

This fixes the regression.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
-
- 04 Jan, 2012 3 commits
Committed by Dan Carpenter
Gcc complains about this: "kernel/cgroup.c:2179:4: warning: suggest parentheses around assignment used as truth value [-Wparentheses]"

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
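The warning quoted above fires when an assignment is used directly as a condition. The affected line in kernel/cgroup.c is not shown here, so the following is only a minimal sketch of the general pattern and its usual fix; `struct node` and `sum_list()` are hypothetical names invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical singly linked node, used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Sum a list; the loop condition is an assignment used as a truth value. */
int sum_list(struct node *head)
{
    struct node *cur = head;
    struct node *n;
    int total = 0;

    /*
     * Writing `while (n = cur)` compiles, but gcc reports it under
     * -Wparentheses. The extra pair of parentheses marks the assignment
     * as intentional and silences the warning without changing behavior.
     */
    while ((n = cur)) {
        total += n->value;
        cur = n->next;
    }
    return total;
}
```

An alternative that some code bases prefer is to split the assignment out of the condition entirely, which avoids the question of intent altogether.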
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
vfs_mkdir() gets int, but immediately drops everything that might not fit into umode_t and that's the only caller of ->mkdir()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
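The point of the message above is that once the mode has been masked down to the permission and sticky bits, nothing is lost by storing it in a narrower type. A rough user-space sketch of that reasoning follows; it assumes a 16-bit unsigned mode type and a mask of 01777, and both `umode_sketch_t` and `mkdir_mode_sketch()` are hypothetical names, not kernel interfaces.

```c
/* Stand-in for the kernel's umode_t; assumed here to be 16 bits wide. */
typedef unsigned short umode_sketch_t;

/* Permission bits (0777) plus the sticky bit (01000). */
#define SKETCH_MKDIR_MASK 01777

/*
 * Drop every bit outside the mask before narrowing. After masking, the
 * value is at most 01777, so the cast to a 16-bit type cannot truncate.
 */
umode_sketch_t mkdir_mode_sketch(int mode)
{
    return (umode_sketch_t)(mode & SKETCH_MKDIR_MASK);
}
```

Under these assumptions, passing the narrow type down to the callback changes nothing observable, since the high bits were already being discarded by the only caller.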
-
- 28 Dec, 2011 3 commits
Committed by Frederic Weisbecker
cgroup_post_fork() is protected between threadgroup_change_begin() and threadgroup_change_end() against concurrent changes of the child's css_set in cgroup_task_migrate(). Also the child can't exit and call cgroup_exit() at this stage, which means its css_set can't be changed with init_css_set concurrently.

For these reasons, we don't need to hold task_lock() on the child because its css_set can only remain stable in this place.

Let's remove the lock there.

v2: Update comment to explain that we are safe against cgroup_exit()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Containers <containers@lists.linux-foundation.org>
Cc: Cgroups <cgroups@vger.kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
-
Committed by Kirill A. Shutemov
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Kirill A. Shutemov
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 22 Dec, 2011 5 commits
Committed by Mandeep Singh Baines
In cgroup_attach_proc it is now sufficient to only check that oldcgrp==newcgrp once. Now that we are using threadgroup_lock() during the migrations, oldcgrp will not change.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Mandeep Singh Baines
threadgroup_lock() guarantees that the target threadgroup will remain stable - no new task will be added, no new PF_EXITING will be set and exec won't happen.

Changes in V2:
* https://lkml.org/lkml/2011/12/20/369 (Tejun Heo)
  * Undo incorrect removal of get/put from attach_task_by_pid()
* Author
  * Remove a comment which is made stale by this change

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Mandeep Singh Baines
We can now assume that the css_set reference held by the task will not go away for an exiting task. PF_EXITING state can be trusted throughout migration by checking it after locking threadgroup.

Changes in V4:
* https://lkml.org/lkml/2011/12/20/368 (Tejun Heo)
  * Fix typo in commit message
  * Undid the rename of css_set_check_fetched
* https://lkml.org/lkml/2011/12/20/427 (Li Zefan)
  * Fix comment in cgroup_task_migrate()
Changes in V3:
* https://lkml.org/lkml/2011/12/20/255 (Frederic Weisbecker)
  * Fixed to put error in retval
Changes in V2:
* https://lkml.org/lkml/2011/12/19/289 (Tejun Heo)
  * Updated commit message

-tj: removed stale patch description about dropped function rename.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Frederic Weisbecker
When we fetch the css_set of the tasks on cgroup migration, we no longer need to synchronize against cgroup_exit(), which could swap the old one with init_css_set. Now that we are using threadgroup_lock() during the migrations, we don't need to worry about it anymore.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Mandeep Singh Baines <msb@chromium.org>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Containers <containers@lists.linux-foundation.org>
Cc: Cgroups <cgroups@vger.kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Frederic Weisbecker
We don't need to hold task_lock() on the parent in cgroup_fork() because we are already synchronized against the two places that may change the parent css_set concurrently:

- cgroup_exit(), but the parent obviously can't exit concurrently
- cgroup migration: we are synchronized against threadgroup_lock()

So we can safely remove the task_lock() there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Containers <containers@lists.linux-foundation.org>
Cc: Cgroups <cgroups@vger.kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
-
- 20 Dec, 2011 2 commits
Committed by Mandeep Singh Baines
We already have a reference to all elements in newcg_list.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Mandeep Singh Baines
There is a BUG when migrating a PF_EXITING proc. Since css_set_prefetch() is not called for the PF_EXITING case, find_existing_css_set() will return NULL inside cgroup_task_migrate() causing a BUG.

This bug is easy to reproduce. Create a zombie and echo its pid to cgroup.procs.

    $ cat zombie.c
    #include <unistd.h>

    int main()
    {
        if (fork())
            pause();
        return 0;
    }
    $

We are hitting this bug pretty regularly on ChromeOS.

This bug is already fixed by Tejun Heo's cgroup patchset which is targeted for the next merge window:

https://lkml.org/lkml/2011/11/1/356

I've created a smaller patch here which just fixes this bug so that a fix can be merged into the current release and stable.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Downstream-Bug-Report: http://crosbug.com/23953
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: stable@kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Olof Johansson <olofj@chromium.org>
-
- 13 Dec, 2011 6 commits
Committed by Tejun Heo
These three methods are no longer used. Kill them.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
-
Committed by Tejun Heo
Currently, there's no way to pass multiple tasks to cgroup_subsys methods, which necessitates separate per-process and per-task methods. This patch introduces cgroup_taskset which can be used to pass multiple tasks and their associated cgroups to cgroup_subsys methods.

Three methods - can_attach(), cancel_attach() and attach() - are converted to use cgroup_taskset. This unifies passed parameters so that all methods have access to all information. Conversions in this patchset are identical and don't introduce any behavior change.

-v2: documentation updated as per Paul Menage's suggestion.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Menage <paul@paulmenage.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
-
Committed by Tejun Heo
cgroup_attach_proc() behaves differently from cgroup_attach_task() in the following aspects.

* All hooks are invoked even if no task is actually being moved.

* ->can_attach_task() is called for all tasks in the group whether the new cgrp is different from the current cgrp or not; however, ->attach_task() is skipped if the new cgrp equals the current one. This makes the calls asymmetric.

This patch improves old cgroup handling in cgroup_attach_proc() by looking up the current cgroup at the head, recording it in the flex array along with the task itself, and using it to remove the above two differences. This will also ease further changes.

-v2: nr_todo renamed to nr_migrating_tasks as per Paul Menage's suggestion.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Menage <paul@paulmenage.org>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
-
Committed by Tejun Heo
Update cgroup to take advantage of the fact that threadgroup_lock() guarantees stable threadgroup.

* Lock threadgroup even if the target is a single task. This guarantees that the target tasks stay stable during migration regardless of the target type.

* Remove PF_EXITING early exit optimization from attach_task_by_pid() and check it in cgroup_task_migrate() instead. The optimization was for rather cold path to begin with and PF_EXITING state can be trusted throughout migration by checking it after locking threadgroup.

* Don't add PF_EXITING tasks to target task array in cgroup_attach_proc(). This ensures that task migration is performed only for live tasks.

* Remove -ESRCH failure path from cgroup_task_migrate(). With the above changes, it's guaranteed to be called only for live tasks.

After the changes, only live tasks are migrated and they're guaranteed to stay alive until migration is complete. This removes problems caused by exec and exit racing against cgroup migration including symmetry among cgroup attach methods and different cgroup methods racing each other.

v2: Oleg pointed out that one more PF_EXITING check can be removed from cgroup_attach_proc(). Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Tejun Heo
Make the following renames to prepare for extension of threadgroup locking.

* s/signal->threadgroup_fork_lock/signal->group_rwsem/
* s/threadgroup_fork_read_lock()/threadgroup_change_begin()/
* s/threadgroup_fork_read_unlock()/threadgroup_change_end()/
* s/threadgroup_fork_write_lock()/threadgroup_lock()/
* s/threadgroup_fork_write_unlock()/threadgroup_unlock()/

This patch doesn't cause any behavior change.

-v2: Rename threadgroup_change_done() to threadgroup_change_end() per KAMEZAWA's suggestion.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
-
Committed by Tejun Heo
cgroup wants to make threadgroup stable while modifying cgroup hierarchies, which will introduce a locking dependency on cred_guard_mutex from cgroup_mutex. This unfortunately completes a circular dependency.

 A. cgroup_mutex -> cred_guard_mutex -> s_type->i_mutex_key -> namespace_sem
 B. namespace_sem -> cgroup_mutex

B is from cgroup_show_options() and this patch breaks it by introducing another mutex cgroup_root_mutex which nests inside cgroup_mutex and protects cgroupfs_root.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
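The circular dependency described above can be checked mechanically by treating each "held while acquiring" relation as a directed edge and looking for a cycle. The following is a small user-space sketch of that idea, not kernel code: `struct lock_graph`, `has_cycle()`, `graph_before()` and `graph_after()` are hypothetical names, and the "after" graph assumes that the path behind edge B now takes cgroup_root_mutex in place of cgroup_mutex, which is one reading of the message.

```c
#include <string.h>

#define MAX_LOCKS 8

/* dep[a][b] != 0 means lock a is held while lock b is acquired. */
struct lock_graph {
    int n;
    int dep[MAX_LOCKS][MAX_LOCKS];
};

enum {
    CGROUP_MUTEX, CRED_GUARD_MUTEX, I_MUTEX_KEY,
    NAMESPACE_SEM, CGROUP_ROOT_MUTEX, NR_LOCKS
};

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static int visit(const struct lock_graph *g, int v, int *state)
{
    state[v] = 1;
    for (int w = 0; w < g->n; w++) {
        if (!g->dep[v][w])
            continue;
        if (state[w] == 1)
            return 1;               /* back edge: circular dependency */
        if (state[w] == 0 && visit(g, w, state))
            return 1;
    }
    state[v] = 2;
    return 0;
}

/* Returns 1 if the lock order graph contains a cycle, 0 otherwise. */
int has_cycle(const struct lock_graph *g)
{
    int state[MAX_LOCKS] = { 0 };

    for (int v = 0; v < g->n; v++)
        if (state[v] == 0 && visit(g, v, state))
            return 1;
    return 0;
}

/* Chain A plus edge B, as listed in the commit message. */
struct lock_graph graph_before(void)
{
    struct lock_graph g;

    memset(&g, 0, sizeof(g));
    g.n = NR_LOCKS;
    g.dep[CGROUP_MUTEX][CRED_GUARD_MUTEX] = 1;
    g.dep[CRED_GUARD_MUTEX][I_MUTEX_KEY] = 1;
    g.dep[I_MUTEX_KEY][NAMESPACE_SEM] = 1;
    g.dep[NAMESPACE_SEM][CGROUP_MUTEX] = 1;        /* edge B */
    return g;
}

/* Assumed layout after the patch: B now ends at cgroup_root_mutex. */
struct lock_graph graph_after(void)
{
    struct lock_graph g = graph_before();

    g.dep[NAMESPACE_SEM][CGROUP_MUTEX] = 0;
    g.dep[NAMESPACE_SEM][CGROUP_ROOT_MUTEX] = 1;
    g.dep[CGROUP_MUTEX][CGROUP_ROOT_MUTEX] = 1;    /* nests inside */
    return g;
}
```

Because cgroup_root_mutex is a leaf in the assumed "after" graph (nothing is acquired while holding it), no path leads back to cgroup_mutex and the cycle disappears.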
-
- 03 Nov, 2011 3 commits
Committed by Andrew Bresticker
While back-porting Johannes Weiner's patch "mm: memcg-aware global reclaim" for an internal effort, we noticed a significant performance regression during page-reclaim heavy workloads due to high contention of the ss->id_lock. This lock protects idr map, and serializes calls to idr_get_next() in css_get_next() (which is used during the memcg hierarchy walk).

Since idr_get_next() is just doing a look up, we need only serialize it with respect to idr_remove()/idr_get_new(). By making the ss->id_lock a rwlock, contention is greatly reduced and performance improves.

Tested: cat a 256m file from a ramdisk in a 128m container 50 times on each core (one file + container per core) in parallel on a NUMA machine. Result is the time for the test to complete in 1 of the containers. Both kernels included Johannes' memcg-aware global reclaim patches.

Before rwlock patch: 1710.778s
After rwlock patch: 152.227s

Signed-off-by: Andrew Bresticker <abrestic@google.com>
Cc: Paul Menage <menage@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Ben Blum
If a task has exited to the point it has called cgroup_exit() already, then we can't migrate it to another cgroup anymore.

This can happen when we are attaching a task to a new cgroup between the call to ->can_attach_task() on subsystems and the migration that is eventually tried in cgroup_task_migrate().

In this case cgroup_task_migrate() returns -ESRCH and we don't want to attach the task to the subsystems because the attachment to the new cgroup itself failed.

Fix this by only calling ->attach_task() on the subsystems if the cgroup migration succeeded.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Acked-by: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Ben Blum
Fix unstable tasklist locking in cgroup_attach_proc.

According to this thread - https://lkml.org/lkml/2011/7/27/243 - RCU is not sufficient to guarantee the tasklist is stable w.r.t. de_thread and exit. Taking tasklist_lock for reading, instead of rcu_read_lock, ensures proper exclusion.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Acked-by: Paul Menage <paul@paulmenage.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Sep, 2011 1 commit
Committed by Thomas Gleixner
The release_list_lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 27 Jul, 2011 1 commit
Committed by Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 20 Jul, 2011 1 commit
Committed by Al Viro
convert the last remaining caller to inode_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 09 Jul, 2011 1 commit
Committed by Michal Hocko
Since ca5ecddf (rcu: define __rcu address space modifier for sparse), rcu_dereference_check uses rcu_read_lock_held as a part of the condition automatically, so callers do not have to do that as well.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
- 09 Jun, 2011 1 commit
Committed by eparis@redhat
We recently found that in some configurations SELinux was blocking the ability for cgroupfs to be mounted. The reason for this is that cgroupfs creates files and directories during the get_sb() call and also uses lookup_one_len() during that same get_sb() call. This is a problem since the security subsystem cannot initialize the superblock and the inodes in that filesystem until after the get_sb() call returns. Thus we leave the inodes in an uninitialized state during get_sb().

For the vast majority of filesystems this is not an issue, but since cgroupfs uses lookup_one_len() it does search permission checks on the directories in the path it walks. Since the inode security state is not set up, SELinux does these checks as if the inodes were 'unlabeled.' Many 'normal' userspace processes do not have permission to interact with unlabeled inodes.

The solution presented here is to do the permission checks of path walk and inode creation as the kernel rather than as the task that called mount. Since the kernel has permission to read/write/create unlabeled inodes, the get_sb() call will complete successfully and the SELinux code will be able to initialize the superblock and those inodes created during the get_sb() call.

This appears to be the same solution used by other filesystems such as devtmpfs to solve the same issue and should thus have no negative impact on other LSMs which currently work.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: James Morris <jmorris@namei.org>
-
- 27 May, 2011 3 commits
Committed by Daniel Lezcano
The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and leads to some problems:

* cgroup creation is out-of-control
* cgroup name can conflict when pids are looping
* it is not possible to have a single process handling a lot of namespaces without falling in an exponential creation time
* we may want to create a namespace without creating a cgroup

The ns_cgroup was replaced by a compatibility flag 'clone_children', where a newly created cgroup will copy the parent cgroup values. The userspace has to manually create a cgroup and add a task to the 'tasks' file.

This patch removes the ns_cgroup as suggested in the following thread:

https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html

The 'cgroup_clone' function is removed because it is no longer used.

This is a userspace-visible change. Commit 45531757 ("cgroup: notify ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a printk warning users that the feature is planned for removal. Since that time we have heard from XXX users who were affected by this.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Ben Blum
Convert cgroup_attach_proc to use flex_array.

The cgroup_attach_proc implementation requires a pre-allocated array to store task pointers to atomically move a thread-group, but asking for a monolithic array with kmalloc() may be unreliable for very large groups. Using flex_array provides the same functionality with less risk of failure.

This is a post-patch for cgroup-procs-write.patch.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
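The trade-off named above is between one large contiguous allocation and many small ones. The sketch below shows the general idea in user space with malloc()/calloc(); it is not the kernel's flex_array API, every name in it (`struct chunked_array`, `chunked_alloc()`, `chunked_put()`, `chunked_get()`, `chunked_free()`, `PART_SLOTS`) is hypothetical, and it allocates parts lazily on first write, whereas a caller that must not fail mid-operation would more likely pre-allocate all parts up front.

```c
#include <stdlib.h>

#define PART_SLOTS 64   /* pointers per part; arbitrary for this sketch */

/* An array of pointers stored as several small, independent parts. */
struct chunked_array {
    size_t total;       /* number of usable slots */
    size_t nr_parts;    /* number of part pointers in parts[] */
    void ***parts;      /* parts[i] stays NULL until first written */
};

struct chunked_array *chunked_alloc(size_t total)
{
    struct chunked_array *ca = malloc(sizeof(*ca));

    if (!ca)
        return NULL;
    ca->total = total;
    ca->nr_parts = (total + PART_SLOTS - 1) / PART_SLOTS;
    /* Only the small table of part pointers is allocated up front. */
    ca->parts = calloc(ca->nr_parts ? ca->nr_parts : 1, sizeof(*ca->parts));
    if (!ca->parts) {
        free(ca);
        return NULL;
    }
    return ca;
}

/* Store ptr at idx, allocating the containing part on demand. */
int chunked_put(struct chunked_array *ca, size_t idx, void *ptr)
{
    size_t part = idx / PART_SLOTS;

    if (idx >= ca->total)
        return -1;
    if (!ca->parts[part]) {
        ca->parts[part] = calloc(PART_SLOTS, sizeof(void *));
        if (!ca->parts[part])
            return -1;
    }
    ca->parts[part][idx % PART_SLOTS] = ptr;
    return 0;
}

/* Return the pointer at idx, or NULL if out of range or never written. */
void *chunked_get(const struct chunked_array *ca, size_t idx)
{
    size_t part = idx / PART_SLOTS;

    if (idx >= ca->total || !ca->parts[part])
        return NULL;
    return ca->parts[part][idx % PART_SLOTS];
}

void chunked_free(struct chunked_array *ca)
{
    for (size_t i = 0; i < ca->nr_parts; i++)
        free(ca->parts[i]);
    free(ca->parts);
    free(ca);
}
```

Each allocation here is bounded by PART_SLOTS pointers regardless of how many tasks the group contains, which is the property that makes this shape less likely to fail than a single allocation sized to the whole group.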
-
Committed by Ben Blum
Make procs file writable to move all threads by tgid at once.

Add functionality that enables users to move all threads in a threadgroup at once to a cgroup by writing the tgid to the 'cgroup.procs' file. This current implementation makes use of a per-threadgroup rwsem that's taken for reading in the fork() path to prevent newly forking threads within the threadgroup from "escaping" while the move is in progress.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-