提交 · f50daa704f36a6544a902c52b6cf37b0493dfc5d · openanolis / cloud-kernel

05 3月, 2013 2 次提交

cgroup: no need to check css refs for release notification · f50daa70

由 Li Zefan 提交于 3月 01, 2013

We no longer fail rmdir() when there're still css refs, so we don't
need to check css refs in check_for_release().

This also voids a bug. cgroup_has_css_refs() accesses subsys[i]
without cgroup_mutex, so it can race with cgroup_unload_subsys().

cgroup_has_css_refs()
...
  if (ss == NULL || ss->root != cgrp->root)

if ss pointers to net_cls_subsys, and cls_cgroup module is unloaded
right after the former check but before the latter, the memory that
net_cls_subsys resides has become invalid.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

f50daa70

cgroup: fix cgroup_path() vs rename() race · 65dff759

由 Li Zefan 提交于 3月 01, 2013

rename() will change dentry->d_name. The result of this race can
be worse than seeing partially rewritten name, but we might access
a stale pointer because rename() will re-allocate memory to hold
a longer name.

As accessing dentry->name must be protected by dentry->d_lock or
parent inode's i_mutex, while on the other hand cgroup-path() can
be called with some irq-safe spinlocks held, we can't generate
cgroup path using dentry->d_name.

Alternatively we make a copy of dentry->d_name and save it in
cgrp->name when a cgroup is created, and update cgrp->name at
rename().

v5: use flexible array instead of zero-size array.
v4: - allocate root_cgroup_name and all root_cgroup->name points to it.
    - add cgroup_name() wrapper.
v3: use kfree_rcu() instead of synchronize_rcu() in user-visible path.
v2: make cgrp->name RCU safe.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

65dff759

28 2月, 2013 3 次提交

hlist: drop the node parameter from iterators · b67bfe0d

由 Sasha Levin 提交于 2月 27, 2013

I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b67bfe0d

cgroup: convert to idr_alloc() · d228d9ec

由 Tejun Heo 提交于 2月 27, 2013

Convert to the much saner new idr interface.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d228d9ec

cgroup: don't use idr_remove_all() · c897ff68

由 Tejun Heo 提交于 2月 27, 2013

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c897ff68

23 2月, 2013 1 次提交
- A
  new helper: file_inode(file) · 496ad9aa
  由 Al Viro 提交于 1月 23, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  496ad9aa
19 2月, 2013 3 次提交

cgroup: fail if monitored file and event_control are in different cgroup · f169007b

由 Li Zefan 提交于 2月 18, 2013

If we pass fd of memory.usage_in_bytes of cgroup A to cgroup.event_control
of cgroup B, then we won't get memory usage notification from A but B!

What's worse, if A and B are in different mount hierarchy, we'll end up
accessing NULL pointer!

Disallow this kind of invalid usage.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NKirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: NTejun Heo <tj@kernel.org>

f169007b

cgroup: fix cgroup_rmdir() vs close(eventfd) race · 810cbee4

由 Li Zefan 提交于 2月 18, 2013

commit 205a872b ("cgroup: fix lockdep
warning for event_control") solved a deadlock by introducing a new
bug.

Move cgrp->event_list to a temporary list doesn't mean you can traverse
this list locklessly, because at the same time cgroup_event_wake() can
be called and remove the event from the list. The result of this race
is disastrous.

We adopt the way how kvm irqfd code implements race-free event removal,
which is now described in the comments in cgroup_event_wake().

v3:
- call eventfd_signal() no matter it's eventfd close or cgroup removal
that removes the cgroup event.
Acked-by: NKirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

810cbee4

cgroup: fix exit() vs rmdir() race · 71b5707e

由 Li Zefan 提交于 1月 24, 2013

In cgroup_exit() put_css_set_taskexit() is called without any lock,
which might lead to accessing a freed cgroup:

thread1                           thread2
---------------------------------------------
exit()
  cgroup_exit()
    put_css_set_taskexit()
      atomic_dec(cgrp->count);
                                   rmdir();
      /* not safe !! */
      check_for_release(cgrp);

rcu_read_lock() can be used to make sure the cgroup is alive.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org

71b5707e

25 1月, 2013 5 次提交

cgroup: remove bogus comments in cgroup_diput() · 9ed8a659

由 Li Zefan 提交于 1月 24, 2013

Since commit 48ddbe19
("cgroup: make css->refcnt clearing on cgroup removal optional"),
each css holds a ref on cgroup's dentry, so cgroup_diput() won't be
called until all css' refs go down to 0, which invalids the comments.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

9ed8a659

cgroup: remove synchronize_rcu() from cgroup_diput() · be445626

由 Li Zefan 提交于 1月 24, 2013

Free cgroup via call_rcu(). The actual work is done through
workqueue.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

be445626

cgroup: remove duplicate RCU free on struct cgroup · 86a3db56

由 Li Zefan 提交于 1月 24, 2013

When destroying a cgroup, though in cgroup_diput() we've called
synchronize_rcu(), we then still have to free it via call_rcu().

The story is, long ago to fix a race between reading /proc/sched_debug
and freeing cgroup, the code was changed to utilize call_rcu(). See
commit a47295e6 ("cgroups: make
cgroup_path() RCU-safe")

As we've fixed cpu cgroup that cpu_cgroup_offline_css() is used
to unregister a task_group so there won't be concurrent access
to this task_group after synchronize_rcu() in diput(). Now we can
just kfree(cgrp).
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

86a3db56

cgroup: initialize cgrp->dentry before css_alloc() · fe1c06ca

由 Li Zefan 提交于 1月 24, 2013

With this change, we're guaranteed that cgroup_path() won't see NULL
cgrp->dentry, and thus we can remove the NULL check in it.

(Well, it's not strictly true, because dummptop.dentry is always NULL
 but we already handle that separately.)
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

fe1c06ca

cgroup: remove a NULL check in cgroup_exit() · b5d646f5

由 Li Zefan 提交于 1月 24, 2013

init_task.cgroups is initialized at boot phase, and whenver a ask
is forked, it's cgroups pointer is inherited from its parent, and
it's never set to NULL afterwards.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

b5d646f5

23 1月, 2013 1 次提交

cgroup: fix bogus kernel warnings when cgroup_create() failed · 2739d3cc

由 Li Zefan 提交于 1月 21, 2013

If cgroup_create() failed and cgroup_destroy_locked() is called to
do cleanup, we'll see a bunch of warnings:

cgroup_addrm_files: failed to remove 2MB.limit_in_bytes, err=-2
cgroup_addrm_files: failed to remove 2MB.usage_in_bytes, err=-2
cgroup_addrm_files: failed to remove 2MB.max_usage_in_bytes, err=-2
cgroup_addrm_files: failed to remove 2MB.failcnt, err=-2
cgroup_addrm_files: failed to remove prioidx, err=-2
cgroup_addrm_files: failed to remove ifpriomap, err=-2
...

We failed to remove those files, because cgroup_create() has failed
before creating those cgroup files.

To fix this, we simply don't warn if cgroup_rm_file() can't find the
cft entry.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

2739d3cc

15 1月, 2013 2 次提交

cgroup: remove synchronize_rcu() from rebind_subsystems() · 130e3695

由 Li Zefan 提交于 1月 14, 2013

Nothing's protected by RCU in rebind_subsystems(), and I can't think
of a reason why it is needed.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

130e3695

cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}() · 5d65bc0c

由 Li Zefan 提交于 1月 14, 2013

These 2 syncronize_rcu()s make attaching a task to a cgroup
quite slow, and it can't be ignored in some situations.

A real case from Colin Cross: Android uses cgroups heavily to
manage thread priorities, putting threads in a background group
with reduced cpu.shares when they are not visible to the user,
and in a foreground group when they are. Some RPCs from foreground
threads to background threads will temporarily move the background
thread into the foreground group for the duration of the RPC.
This results in many calls to cgroup_attach_task.

In cgroup_attach_task() it's task->cgroups that is protected by RCU,
and put_css_set() calls kfree_rcu() to free it.

If we remove this synchronize_rcu(), there can be threads in RCU-read
sections accessing their old cgroup via current->cgroups with
concurrent rmdir operation, but this is safe.

 # time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }

real    0m2.524s
user    0m0.008s
sys     0m0.004s

With this patch:

real    0m0.004s
user    0m0.004s
sys     0m0.000s

tj: These synchronize_rcu()s are utterly confused.  synchornize_rcu()
    necessarily has to come between two operations to guarantee that
    the changes made by the former operation are visible to all rcu
    readers before proceeding to the latter operation.  Here,
    synchornize_rcu() are at the end of attach operations with nothing
    beyond it.  Its only effect would be delaying completion of
    write(2) to sysfs tasks/procs files until all rcu readers see the
    change, which doesn't mean anything.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NColin Cross <ccross@google.com>

5d65bc0c

11 1月, 2013 1 次提交

cgroup: use new hashtable implementation · 0ac801fe

由 Li Zefan 提交于 1月 10, 2013

Switch cgroup to use the new hashtable implementation. No functional changes.
Signed-off-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

0ac801fe

08 1月, 2013 1 次提交

cgroup: implement cgroup_rightmost_descendant() · 12a9d2fe

由 Tejun Heo 提交于 1月 07, 2013

Implement cgroup_rightmost_descendant() which returns the right most
descendant of the specified cgroup.  This can be used to skip the
cgroup's subtree while iterating with
cgroup_for_each_descendant_pre().
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Acked-by: NLi Zefan <lizefan@huawei.com>

12a9d2fe

18 12月, 2012 1 次提交

kernel: remove reference to feature-removal-schedule.txt · 8ec7d50f

由 Tao Ma 提交于 12月 17, 2012

In commit 9c0ece06 ("Get rid of Documentation/feature-removal.txt"),
Linus removed feature-removal-schedule.txt from Documentation, but there
is still some reference to this file.  So remove them.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8ec7d50f

07 12月, 2012 1 次提交

cgroup_rm_file: don't delete the uncreated files · f33fddc2

由 Gao feng 提交于 12月 06, 2012

in cgroup_add_file,when creating files for cgroup,
some of creation may be skipped. So we need to avoid
deleting these uncreated files in cgroup_rm_file,
otherwise the warning msg will be triggered.

"cgroup_addrm_files: failed to remove memory_pressure_enabled, err=-2"
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@redhat.com>
Cc: stable@vger.kernel.org

f33fddc2

04 12月, 2012 1 次提交

cgroup: remove subsystem files when remounting cgroup · 7083d037

由 Gao feng 提交于 12月 03, 2012

cgroup_clear_directroy is called by cgroup_d_remove_dir
and cgroup_remount.

when we call cgroup_remount to remount the cgroup,the subsystem
may be unlinked from cgroupfs_root->subsys_list in rebind_subsystem,this
subsystem's files will not be removed in cgroup_clear_directroy.
And the system will panic when we try to access these files.

this patch removes subsystems's files before rebind_subsystems,
if rebind_subsystems failed, repopulate these removed files.

With help from Tejun.
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

7083d037

01 12月, 2012 1 次提交

cgroup: use cgroup_addrm_files() in cgroup_clear_directory() · 879a3d9d

由 Gao feng 提交于 12月 01, 2012

cgroup_clear_directory() incorrectly invokes cgroup_rm_file() on each
cftset of the target subsystems, which only removes the first file of
each set.  This leaves dangling files after subsystems are removed
from a cgroup root via remount.

Use cgroup_addrm_files() to remove all files of target subsystems.

tj: Move cgroup_addrm_files() prototype decl upwards next to other
    global declarations.  Commit message updated.
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

879a3d9d

30 11月, 2012 1 次提交

cgroup: warn about broken hierarchies only after css_online · 1f869e87

由 Glauber Costa 提交于 11月 30, 2012

If everything goes right, it shouldn't really matter if we are spitting
this warning after css_alloc or css_online. If we fail between then,
there are some ill cases where we would previously see the message and
now we won't (like if the files fail to be created).

I believe it really shouldn't matter: this message is intended in spirit
to be shown when creation succeeds, but with insane settings.
Signed-off-by: NGlauber Costa <glommer@parallels.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

1f869e87

29 11月, 2012 2 次提交

cgroup: list_del_init() on removed events · 9718ceb3

由 Greg Thelen 提交于 11月 28, 2012

Use list_del_init() rather than list_del() to remove events from
cgrp->event_list.  No functional change.  This is just defensive
coding.
Signed-off-by: NGreg Thelen <gthelen@google.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

9718ceb3

cgroup: fix lockdep warning for event_control · 205a872b

由 Greg Thelen 提交于 11月 28, 2012

The cgroup_event_wake() function is called with the wait queue head
locked and it takes cgrp->event_list_lock. However, in cgroup_rmdir()
remove_wait_queue() was being called after taking
cgrp->event_list_lock.  Correct the lock ordering by using a temporary
list to obtain the event list to remove from the wait queue.
Signed-off-by: NGreg Thelen <gthelen@google.com>
Signed-off-by: NAaron Durbin <adurbin@google.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

205a872b

28 11月, 2012 1 次提交

cgroup: move list add after list head initilization · fddfb02a

由 Li Zhong 提交于 11月 28, 2012

2243076a ("cgroup: initialize cgrp->allcg_node in
init_cgroup_housekeeping()") initializes cgrp->allcg_node in
init_cgroup_housekeeping().  Then in init_cgroup_root(), we should
call init_cgroup_housekeeping() before adding it to &root->allcg_list;
otherwise, we are initializing an entry already in a list.
Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

fddfb02a

20 11月, 2012 13 次提交

cgroup: remove obsolete guarantee from cgroup_task_migrate. · d0b2fdd2

由 Tao Ma 提交于 11月 20, 2012

'guarantee' is already removed from cgroup_task_migrate, so remove
the corresponding comments. Some other typos in cgroup are also
changed.

Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

d0b2fdd2

cgroup: add cgroup->id · 0a950f65

由 Tejun Heo 提交于 11月 19, 2012

With the introduction of generic cgroup hierarchy iterators, css_id is
being phased out.  It was unnecessarily complex, id'ing the wrong
thing (cgroups need IDs, not CSSes) and has other oddities like not
being available at ->css_alloc().

This patch adds cgroup->id, which is a simple per-hierarchy
ida-allocated ID which is assigned before ->css_alloc() and released
after ->css_free().
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>

0a950f65

cgroup, cpuset: remove cgroup_subsys->post_clone() · 033fa1c5

由 Tejun Heo 提交于 11月 19, 2012

Currently CGRP_CPUSET_CLONE_CHILDREN triggers ->post_clone().  Now
that clone_children is cpuset specific, there's no reason to have this
rather odd option activation mechanism in cgroup core.  cpuset can
check the flag from its ->css_allocate() and take the necessary
action.

Move cpuset_post_clone() logic to the end of cpuset_css_alloc() and
remove cgroup_subsys->post_clone().

Loosely based on Glauber's "generalize post_clone into post_create"
patch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Original-patch-by: NGlauber Costa <glommer@parallels.com>
Original-patch: <1351686554-22592-2-git-send-email-glommer@parallels.com>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>

033fa1c5

cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/ · 2260e7fc

由 Tejun Heo 提交于 11月 19, 2012

clone_children is only meaningful for cpuset and will stay that way.
Rename the flag to reflect that and update documentation.  Also, drop
clone_children() wrapper in cgroup.c.  The thin wrapper is used only a
few times and one of them will go away soon.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>

2260e7fc

cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() · 92fb9748

由 Tejun Heo 提交于 11月 19, 2012

Rename cgroup_subsys css lifetime related callbacks to better describe
what their roles are.  Also, update documentation.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

92fb9748

cgroup: allow ->post_create() to fail · b1929db4

由 Tejun Heo 提交于 11月 19, 2012

There could be cases where controllers want to do initialization
operations which may fail from ->post_create().  This patch makes
->post_create() return -errno to indicate failure and online_css()
relay such failures.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>

b1929db4

cgroup: update cgroup_create() failure path · 4b8b47eb

由 Tejun Heo 提交于 11月 19, 2012

cgroup_create() was ignoring failure of cgroupfs files.  Update it
such that, if file creation fails, it rolls back by calling
cgroup_destroy_locked() and returns failure.

Note that error out goto labels are renamed.  The labels are a bit
confusing but will become better w/ later cgroup operation renames.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

4b8b47eb

cgroup: use mutex_trylock() when grabbing i_mutex of a new cgroup directory · b8a2df6a

由 Tejun Heo 提交于 11月 19, 2012

All cgroup directory i_mutexes nest outside cgroup_mutex; however, new
directory creation is a special case.  A new cgroup directory is
created while holding cgroup_mutex.  Populating the new directory
requires both the new directory's i_mutex and cgroup_mutex.  Because
all directory i_mutexes nest outside cgroup_mutex, grabbing both
requires releasing cgroup_mutex first, which isn't a good idea as the
new cgroup isn't yet ready to be manipulated by other cgroup
opreations.

This is worked around by grabbing the new directory's i_mutex while
holding cgroup_mutex before making it visible.  As there's no other
user at that point, grabbing the i_mutex under cgroup_mutex can't lead
to deadlock.

cgroup_create_file() was using I_MUTEX_CHILD to tell lockdep not to
worry about the reverse locking order; however, this creates pseudo
locking dependency cgroup_mutex -> I_MUTEX_CHILD, which isn't true -
all directory i_mutexes are still nested outside cgroup_mutex.  This
pseudo locking dependency can lead to spurious lockdep warnings.

Use mutex_trylock() instead.  This will always succeed and lockdep
doesn't create any locking dependency for it.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

b8a2df6a

cgroup: simplify cgroup_load_subsys() failure path · d19e19de

由 Tejun Heo 提交于 11月 19, 2012

Now that cgroup_unload_subsys() can tell whether the root css is
online or not, we can safely call cgroup_unload_subsys() after idr
init failure in cgroup_load_subsys().

Replace the manual unrolling and invoke cgroup_unload_subsys() on
failure.  This drops cgroup_mutex inbetween but should be safe as the
subsystem will fail try_module_get() and thus can't be mounted
inbetween.  As this means that cgroup_unload_subsys() can be called
before css_sets are rehashed, remove BUG_ON() on %NULL
css_set->subsys[] from cgroup_unload_subsys().
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

d19e19de

cgroup: introduce CSS_ONLINE flag and on/offline_css() helpers · a31f2d3f

由 Tejun Heo 提交于 11月 19, 2012

New helpers on/offline_css() respectively wrap ->post_create() and
->pre_destroy() invocations.  online_css() sets CSS_ONLINE after
->post_create() is complete and offline_css() invokes ->pre_destroy()
iff CSS_ONLINE is set and clears it while also handling the temporary
dropping of cgroup_mutex.

This patch doesn't introduce any behavior change at the moment but
will be used to improve cgroup_create() failure path and allow
->post_create() to fail.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

a31f2d3f

cgroup: separate out cgroup_destroy_locked() · 42809dd4

由 Tejun Heo 提交于 11月 19, 2012

Separate out cgroup_destroy_locked() from cgroup_destroy().  This will
be later used in cgroup_create() failure path.

While at it, add lockdep asserts on i_mutex and cgroup_mutex, and move
@d and @parent assignments to their declarations.

This patch doesn't introduce any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

42809dd4

cgroup: fix harmless bugs in cgroup_load_subsys() fail path and cgroup_unload_subsys() · 02ae7486

由 Tejun Heo 提交于 11月 19, 2012

* If idr init fails, cgroup_load_subsys() cleared dummytop->subsys[]
  before calilng ->destroy() making CSS inaccessible to the callback,
  and didn't unlink ss->sibling.  As no modular controller uses
  ->use_id, this doesn't cause any actual problems.

* cgroup_unload_subsys() was forgetting to free idr, call
  ->pre_destroy() and clear ->active.  As there currently is no
  modular controller which uses ->use_id, ->pre_destroy() or ->active,
  this doesn't cause any actual problems.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

02ae7486

cgroup: lock cgroup_mutex in cgroup_init_subsys() · 648bb56d

由 Tejun Heo 提交于 11月 19, 2012

Make cgroup_init_subsys() grab cgroup_mutex while initializing a
subsystem so that all helpers and callbacks are called under the
context they expect.  This isn't strictly necessary as
cgroup_init_subsys() doesn't race with anybody but will allow adding
lockdep assertions.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

648bb56d

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功