Commits · 712317ad97f41e738e1a19aa0a6392a78a84094e · openeuler / raspberrypi-kernel

19 Apr, 2013 2 commits

cgroup: fix broken file xattrs · 712317ad

Committed by Li Zefan on Apr 18, 2013

We should store file xattrs in struct cfent instead of struct cftype,
because cftype is a type while cfent is an object instance of cftype.

For example, each cgroup has a tasks file, and each tasks file is
associated with a unique cfent, but all those files share the same
struct cftype.

Alexey Kodanev reported a crash, which can be reproduced:

  # mount -t cgroup -o xattr /sys/fs/cgroup
  # mkdir /sys/fs/cgroup/test
  # setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks
  # rmdir /sys/fs/cgroup/test
  # umount /sys/fs/cgroup
  oops!

In this case, simple_xattrs_free() will free the same struct simple_xattrs
twice.

tj: Dropped unused local variable @cft from cgroup_diput().

Cc: <stable@vger.kernel.org> # 3.8.x
Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
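
The distinction that 712317ad draws between a shared type and a per-file instance can be sketched in plain C. This is a minimal userspace illustration, not the kernel's code: `struct file_type`, `struct file_entry`, `struct xattr_store` and both helpers are hypothetical names standing in for cftype, cfent and simple_xattrs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an xattr container; not the kernel's struct. */
struct xattr_store {
    int nr_attrs;   /* number of attributes currently attached */
    int freed;      /* set once the store has been released */
};

/* One per file *type*; shared by every directory that exposes the file. */
struct file_type {
    const char *name;
};

/* One per file *instance*; per-file state such as xattrs belongs here. */
struct file_entry {
    const struct file_type *type;
    struct xattr_store xattrs;
};

/* Allocates an instance of @type with its own, empty xattr store. */
static struct file_entry *file_entry_create(const struct file_type *type)
{
    struct file_entry *fe;

    assert(type != NULL);
    fe = calloc(1, sizeof(*fe));
    if (fe)
        fe->type = type;
    return fe;
}

/* Releases a store; returns 0 on success, -1 if it was already released. */
static int xattr_store_free(struct xattr_store *xs)
{
    if (xs->freed)
        return -1;
    xs->freed = 1;
    xs->nr_attrs = 0;
    return 0;
}
```

Because each entry owns its store, tearing down two files of the same type releases two different objects. Had the store lived in the shared type, the second teardown would release the same object again.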

devcg: remove parent_cgroup. · e57d5cf2

Committed by Rami Rosen on Apr 16, 2013

In devcgroup_css_alloc(), there is no longer any need for parent_cgroup.
bd2953eb ("devcg: propagate local changes down the hierarchy") made
the variable parent_cgroup redundant. This patch removes parent_cgroup
from devcgroup_css_alloc().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

16 Apr, 2013 1 commit

memcg: force use_hierarchy if sane_behavior · f00baae7

Committed by Tejun Heo on Apr 15, 2013

Turn on use_hierarchy by default if sane_behavior is specified and
don't create .use_hierarchy file.

It is debatable whether to remove the .use_hierarchy file or make it
read-only, as the latter could make the transition easier in certain
cases; however, the behavior changes which will be gated by
sane_behavior are extensive, including changing the basic meaning of
certain control knobs in a few controllers, and I don't really think
keeping this piece would make things easier in any noticeable way, so
let's remove it.

v2: Explain that mem_cgroup_bind() doesn't have to worry about
    children as suggested by Michal Hocko.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

15 Apr, 2013 5 commits

cgroup: remove cgrp->top_cgroup · 05fb22ec

Committed by Li Zefan on Apr 15, 2013

It's not used, and it can be retrieved via cgrp->root->top_cgroup.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: introduce sane_behavior mount option · 873fe09e

Committed by Tejun Heo on Apr 14, 2013

It's a sad fact that at this point various cgroup controllers are
carrying so many idiosyncrasies and pure insanities that it simply
isn't possible to reach any sort of sane consistent behavior while
staying fully compatible with what already has been exposed to
userland.

As we can't break exposed userland interface, transitioning to sane
behaviors can only be done in steps while maintaining backwards
compatibility.  This patch introduces a new mount option -
__DEVEL__sane_behavior - which disables crazy features and enforces
consistent behaviors in cgroup core proper and various controllers.
As exactly which behaviors it changes are still being determined, the
mount option, at this point, is useful only for development of the new
behaviors.  As such, the mount option is prefixed with __DEVEL__ and
generates a warning message when used.

Eventually, once we get to the point where all controllers' behaviors
are consistent enough to implement unified hierarchy, the __DEVEL__
prefix will be dropped, and more importantly, unified-hierarchy will
enforce sane_behavior by default.  Maybe we'll be able to completely
drop the crazy stuff after a while, maybe not, but we at least have a
strategy to move on to saner behaviors.

This patch introduces the mount option and changes the following
behaviors in cgroup core.

* Mount options "noprefix" and "clone_children" are disallowed.  Also,
  cgroupfs file cgroup.clone_children is not created.

* When mounting an existing superblock, mount options should match.
  This is currently pretty crazy.  If one mounts a cgroup, creates a
  subdirectory, unmounts it and then mounts it again with different
  options, it looks like the new options are applied but they aren't.

* Remount is disallowed.

The behavior changes are documented in the comment above the
CGRP_ROOT_SANE_BEHAVIOR enum and will be expanded as different
controllers are converted and planned improvements progress.

v2: Dropped unnecessary explicit file permission setting for the
    sane_behavior cftype entry as suggested by Li Zefan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>

move cgroupfs_root to include/linux/cgroup.h · 25a7e684

Committed by Tejun Heo on Apr 14, 2013

While controllers shouldn't be accessing cgroupfs_root directly, it
being hidden inside kernel/cgroup.c makes some things pretty silly.  This
makes routing hierarchy-wide settings which need to be visible to
controllers cumbersome.

We're gonna add another hierarchy-wide setting which needs to be
accessed from controllers.  Move cgroupfs_root and its flags to the
header file so that we can access root settings with inline helpers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix · 93438629

Committed by Tejun Heo on Apr 14, 2013

There's no reason to be using bitops, which tends to be more
cumbersome, to handle root flags.  Convert them to masks.  Also, as
they'll be moved to include/linux/cgroup.h and it's generally a good
idea, add CGRP_ prefix.

Note that flags are assigned from (1 << 1).  The first bit will be
used by a flag which will be added soon.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup: make cgroup_path() not print double slashes · da1f296f

Committed by Tejun Heo on Apr 14, 2013

While reimplementing cgroup_path(), 65dff759 ("cgroup: fix
cgroup_path() vs rename() race") introduced a bug where the path of a
non-root cgroup would have two slashes at the beginning, which is
caused by treating the root cgroup, which has the name '/', like
non-root cgroups.

 $ grep systemd /proc/self/cgroup
 1:name=systemd://user/root/1

Fix it by special-casing the root cgroup and not looping over it in
the normal path.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
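
The fix described in da1f296f can be illustrated with a small userspace sketch. It assumes a hypothetical `struct node` whose root is named "/" and builds the path from the leaf upward; `node_path()` is an illustrative helper, not the kernel's cgroup_path().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tree node; the root is named "/" and has no parent. */
struct node {
    const char *name;
    const struct node *parent;
};

/*
 * Writes the path of @n into @buf (capacity @buflen).
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int node_path(const struct node *n, char *buf, size_t buflen)
{
    char *start;

    assert(n != NULL && buf != NULL);
    if (buflen < 2)
        return -1;

    /* Root is special-cased: its name already is the whole path. */
    if (!n->parent) {
        strcpy(buf, "/");
        return 0;
    }

    /* Fill the buffer from the end, prepending "/<name>" per level. */
    start = buf + buflen - 1;
    *start = '\0';
    do {
        size_t len = strlen(n->name);

        if ((size_t)(start - buf) < len + 1)
            return -1;
        start -= len;
        memcpy(start, n->name, len);
        *--start = '/';
        n = n->parent;
    } while (n->parent);    /* stop before the root, so no second '/' */

    memmove(buf, start, (size_t)(buf + buflen - start));
    return 0;
}
```

If the loop also visited the root, its "/" name would be prepended after the separator and the result would begin with two slashes, which is the symptom shown in the commit message.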

13 Apr, 2013 1 commit

Revert "cgroup: remove bind() method from cgroup_subsys." · 26d5bbe5

Committed by Tejun Heo on Apr 12, 2013

This reverts commit 84cfb6ab.  There
are scheduled changes which make use of the removed callback.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rami Rosen <ramirose@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>

11 Apr, 2013 4 commits

perf: make perf_event cgroup hierarchical · ef824fa1

Committed by Tejun Heo on Apr 08, 2013

perf_event is one of a couple remaining cgroup controllers with broken
hierarchy support.  Converting it to support hierarchy is almost
trivial.  The only thing necessary is to consider a task belonging to
a descendant cgroup as a match.  IOW, if the cgroup of the currently
executing task (@cpuctx->cgrp) equals or is a descendant of the
event's cgroup (@event->cgrp), then the event should be enabled.

Implement hierarchy support and remove .broken_hierarchy tag along
with the incorrect comment on what needs to be done for hierarchy
support.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>

cgroup: implement cgroup_is_descendant() · 78574cf9

Committed by Li Zefan on Apr 08, 2013

A couple controllers want to determine whether two cgroups are in
ancestor/descendant relationship.  As it's more likely that the
descendant is the primary subject of interest and there are other
operations focusing on the descendants, let's ask is_descendant rather
than is_ancestor.

Implementation is trivial as the previous patch guarantees that all
ancestors of a cgroup stay accessible as long as the cgroup is
accessible.

tj: Removed depth optimization, renamed from cgroup_is_ancestor(),
    rewrote descriptions.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
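
A descendant test of the kind 78574cf9 describes can be sketched as a walk up parent pointers. `struct cg` and `cg_is_descendant()` are hypothetical names for this illustration; treating a group as a descendant of itself follows the "equals or is a descendant" wording in the perf_event commit above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical group with a parent pointer; the root's parent is NULL. */
struct cg {
    const struct cg *parent;
};

/*
 * Returns true if @cgrp is @ancestor itself or lies somewhere below it.
 * The walk is only safe while every ancestor stays accessible.
 */
static bool cg_is_descendant(const struct cg *cgrp, const struct cg *ancestor)
{
    assert(ancestor != NULL);
    while (cgrp) {
        if (cgrp == ancestor)
            return true;
        cgrp = cgrp->parent;
    }
    return false;
}
```

The cost is linear in the depth of the hierarchy. The walk relies on the guarantee from the preceding patch that a parent is never destroyed before its children.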

cgroup: make sure parent won't be destroyed before its children · 415cf07a

Committed by Li Zefan on Apr 08, 2013

Suppose we rmdir a cgroup while there are still css refs: this cgroup
won't be freed. If we then rmdir the parent cgroup, the parent is freed
immediately because its css refs drain to 0. Now it would be a disaster if
the still-alive child cgroup tries to access its parent.

Make sure this won't happen.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
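
One common way to obtain the guarantee 415cf07a asks for is to let each child hold a reference on its parent. The sketch below shows that idea only; it is an assumption for illustration and not necessarily how the patch implements it. `struct cg`, `cg_init()` and `cg_put()` are hypothetical, and a `released` flag stands in for actually freeing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted group; `released` models the object being freed. */
struct cg {
    struct cg *parent;
    int refs;
    bool released;
};

/* Initialises @c with one reference and pins @parent for @c's lifetime. */
static void cg_init(struct cg *c, struct cg *parent)
{
    assert(c != NULL);
    c->parent = parent;
    c->refs = 1;
    c->released = false;
    if (parent)
        parent->refs++;
}

/* Drops one reference; a released child then drops its pin on the parent. */
static void cg_put(struct cg *c)
{
    while (c) {
        assert(c->refs > 0);
        if (--c->refs > 0)
            return;
        c->released = true;
        c = c->parent;
    }
}
```

With this scheme, removing the parent while a child is still referenced only drops the parent's own reference. The parent is released after the last child reference goes away.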

cgroup: remove bind() method from cgroup_subsys. · 84cfb6ab

Committed by Rami Rosen on Apr 10, 2013

The bind() method of cgroup_subsys is not used in any of the
controllers (cpuset, freezer, blkio, net_cls, memcg, net_prio,
devices, perf, hugetlb, cpu and cpuacct).

tj: Removed the entry on ->bind() from
    Documentation/cgroups/cgroups.txt.  Also updated a couple
    paragraphs which were suggesting that dynamic re-binding may be
    implemented.  It's not gonna.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

08 Apr, 2013 7 commits

devcg: remove broken_hierarchy tag · 8adf12b0

Committed by Tejun Heo on Apr 07, 2013

bd2953eb ("devcg: propagate local changes down the hierarchy")
implemented proper hierarchy support.  Remove the broken tag.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>

cgroup: remove cgroup_lock_is_held() · 2219449a

Committed by Tejun Heo on Apr 07, 2013

We don't want controllers to assume that the information is officially
available and do funky things with it.

The only user is task_subsys_state_check() which uses it to verify RCU
access context.  We can move cgroup_lock_is_held() inside
CONFIG_PROVE_RCU but that doesn't add meaningful protection compared
to conditionally exposing cgroup_mutex.

Remove cgroup_lock_is_held(), export cgroup_mutex iff CONFIG_PROVE_RCU
and use lockdep_is_held() directly on the mutex in
task_subsys_state_check().

While at it, add parentheses around macro arguments in
task_subsys_state_check().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup: kill cgroup_[un]lock() · 47cfcd09

Committed by Tejun Heo on Apr 07, 2013

Now that the locking interface is unexported, there's no reason to keep
around these thin wrappers.  Kill them and use mutex operations
directly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup: unexport locking interface and cgroup_attach_task() · b9777cf8

Committed by Tejun Heo on Apr 07, 2013

Now that all external cgroup_lock() users are gone, we can finally
unexport the locking interface and prevent future abuse of
cgroup_mutex.

Make cgroup_[un]lock() and cgroup_lock_live_group() static.  Also,
cgroup_attach_task() doesn't have any users left and can't be used
without the locking interface anyway.  Make it static too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup: relocate cgroup_lock_live_group() and cgroup_attach_task_all() · 7ae1bad9

Committed by Tejun Heo on Apr 07, 2013

cgroup_lock_live_group() and cgroup_attach_task() are scheduled to be
made static.  Relocate the former and cgroup_attach_task_all() so that
we don't need forward declarations.

This patch is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>

cgroup, cpuset: replace move_member_tasks_to_cpuset() with cgroup_transfer_tasks() · 8cc99345

Committed by Tejun Heo on Apr 07, 2013

When a cpuset becomes empty (no CPU or memory), its tasks are
transferred to the nearest ancestor with execution resources.  This
is implemented using cgroup_scan_tasks() with a callback which grabs
cgroup_mutex and invokes cgroup_attach_task() on each task.

Both cgroup_mutex and cgroup_attach_task() are scheduled to be
unexported.  Implement cgroup_transfer_tasks() in cgroup proper which
is essentially the same as move_member_tasks_to_cpuset() except that
it takes cgroups instead of cpusets and @to comes before @from like
normal functions with those arguments, and replace
move_member_tasks_to_cpuset() with it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>

memcg: fix memcg_cache_name() to use cgroup_name() · d9c10ddd

Committed by Michal Hocko on Mar 28, 2013

As cgroup supports rename, it's unsafe to dereference dentry->d_name
without proper vfs locks. Fix this by using cgroup_name() rather than
dentry directly.

Also open-code memcg_cache_name() because it is called only from
kmem_cache_dup(), which frees the returned name right after
kmem_cache_create_memcg() makes a copy of it. Such a short-lived
allocation doesn't make much sense, so replace it with a static
buffer, as kmem_cache_dup() is called with memcg_cache_mutex held.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

04 Apr, 2013 2 commits

cgroup: remove unused parameter in cgroup_task_migrate(). · 1e2ccd1c

Committed by Kevin Wilson on Apr 01, 2013

This patch removes an unused parameter from cgroup_task_migrate().

Signed-off-by: Kevin Wilson <wkevils@gmail.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroups: Documentation/cgroup/cgroup.txt - a trivial fix. · 1ae65ae9

Committed by Rami Rosen on Apr 03, 2013

This trivial patch removes a word which appears twice in
Documentation/cgroup/cgroup.txt.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

20 Mar, 2013 6 commits

cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc() · 081aa458

Committed by Li Zefan on Mar 13, 2013

These two functions share most of the code.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

devcg: propagate local changes down the hierarchy · bd2953eb

Committed by Aristeu Rozanski on Feb 15, 2013

This patch makes exception changes propagate down the hierarchy, respecting
local exceptions when possible.

New exceptions allowing additional access to devices won't be propagated, but
it'll be possible to add an exception to access all or part of the newly
allowed device(s).

New exceptions disallowing access to devices will be propagated down and the
local group's exceptions will be revalidated for the new situation.
Example:
      A
     / \
        B

    group        behavior          exceptions
    A            allow             "b 8:* rwm", "c 116:1 rw"
    B            deny              "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a new exception is added to group A:
	# echo "c 116:* r" > A/devices.deny
it'll propagate down and after revalidating B's local exceptions, the exception
"c 116:2 rwm" will be removed.

In case parent's exceptions change and local exceptions are not allowed anymore,
they'll be deleted.

v7:
- do not allow behavior change when the cgroup has children
- update documentation

v6: fixed issues pointed out by Serge Hallyn
- only copy parent's exceptions while propagating behavior if the local
  behavior is different
- while propagating exceptions, do not clear and copy parent's: it'd be against
  the premise we don't propagate access to more devices

v5: fixed issues pointed out by Serge Hallyn
- updated documentation
- not propagating when an exception is written to devices.allow
- when propagating a new behavior, clean the local exceptions list if they're
  for a different behavior

v4: fixed issues pointed out by Tejun Heo
- separated function to walk the tree and collect valid propagation targets

v3: fixed issues pointed out by Tejun Heo
- update documentation
- move css_online/css_offline changes to a new patch
- use cgroup_for_each_descendant_pre() instead of own descendant walk
- move exception_copy rework to a separated patch
- move exception_clean rework to a separated patch

v2: fixed issues pointed out by Tejun Heo
- instead of keeping the local settings that won't apply anymore, remove them

Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

devcg: use css_online and css_offline · 1909554c

Committed by Aristeu Rozanski on Feb 15, 2013

Allocate resources and change behavior only when online. This is needed in
order to determine if a node is suitable for hierarchy propagation or if it's
being removed.

Locking:
Both functions take devcgroup_mutex to make changes to device_cgroup structure.
Hierarchy propagation will also take devcgroup_mutex before walking the
tree while walking the tree itself is protected by rcu lock.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

devcg: prepare may_access() for hierarchy support · c39a2a30

Committed by Aristeu Rozanski on Feb 15, 2013

Currently may_access() is only able to verify if an exception is valid for the
current cgroup, which has the same behavior. With hierarchy, it'll be also used
to verify if a cgroup local exception is valid towards its cgroup parent, which
might have different behavior.

v2:
- updated patch description
- rebased on top of a new patch to expand the may_access() logic to make it
  more clear
- fixed argument description order in may_access()

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

devcg: expand may_access() logic · 26898fdf

Committed by Aristeu Rozanski on Feb 15, 2013

In order to make the next patch more clear, expand may_access() logic.

v2: may_access() returns bool now

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: fix an off-by-one bug which may trigger BUG_ON() · 3ac1707a

Committed by Li Zefan on Mar 12, 2013

The 3rd parameter of flex_array_prealloc() is the number of elements,
not the index of the last element.

The effect of the bug is that, when opening cgroup.procs, a flex array
will be allocated and all elements of the array are allocated with the
GFP_KERNEL flag except the last one, which is allocated with GFP_ATOMIC,
and if we fail to allocate memory for it, it'll trigger a BUG_ON().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org

13 Mar, 2013 6 commits

cgroup: remove useless code in cgroup_write_event_control() · 80f36c2a

Committed by Li Zefan on Mar 12, 2013

eventfd_poll() never returns POLLHUP.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: don't bother to resize pid array · 6ee211ad

Committed by Li Zefan on Mar 12, 2013

When we open cgroup.procs, we allocate a buffer and store all tasks'
tgids in it, and then duplicate entries are stripped. If that results
in a much smaller pid list, we re-allocate a smaller buffer.

But we've already successfully allocated memory, reading the procs
file takes a short time and the memory will be freed very soon, so why
bother to re-allocate memory?

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
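
The approach in 6ee211ad, stripping duplicates but keeping the original allocation, can be sketched in userspace C. `pidlist_uniq()` and `cmp_pid()` are hypothetical helpers for this illustration, and sorting before the strip is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for ints that avoids overflow on subtraction. */
static int cmp_pid(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/*
 * Sorts @list and strips duplicates in place. Returns the new length.
 * The buffer is deliberately not shrunk; only the logical length changes.
 */
static size_t pidlist_uniq(int *list, size_t len)
{
    size_t src, dest = 1;

    assert(list != NULL || len == 0);
    if (len < 2)
        return len;

    qsort(list, len, sizeof(*list), cmp_pid);
    for (src = 1; src < len; src++)
        if (list[src] != list[dest - 1])
            list[dest++] = list[src];
    return dest;
}
```

The caller keeps using the same buffer with the shorter length and frees it once the read is finished, so no second allocation is needed.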

cgroup: hold cgroup_mutex before calling css_offline() · d7eeac19

Committed by Li Zefan on Mar 12, 2013

cpuset no longer nests cgroup_mutex inside cpu_hotplug lock, so
we don't have to release cgroup_mutex before calling css_offline().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: remove unused variables in cgroup_destroy_locked() · 6dc01181

Committed by Li Zefan on Mar 12, 2013

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: remove cgroup_is_descendant() · e7b2dcc5

Committed by Li Zefan on Mar 12, 2013

It was used by ns cgroup, and ns cgroup was removed long ago.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cpuset: fix RCU lockdep splat in cpuset_print_task_mems_allowed() · cfb5966b

Committed by Li Zefan on Mar 12, 2013

Sasha reported a lockdep warning when OOM was triggered. The reason
is that cgroup_name() should be called with rcu_read_lock() held.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

06 Mar, 2013 3 commits

cpuset: remove include of cgroup.h from cpuset.h · ff794dea

Committed by Li Zefan on Mar 05, 2013

We don't need to include cgroup.h in cpuset.h.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

res_counter: remove include of cgroup.h from res_counter.h · 9259826c

Committed by Li Zefan on Mar 05, 2013

It's not needed at all.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: avoid accessing modular cgroup subsys structure without locking · 7d8e0bf5

Committed by Li Zefan on Mar 05, 2013

subsys[i] is set to NULL in cgroup_unload_subsys() at module unload,
and that's protected by cgroup_mutex, and then the memory in which
*subsys[i] resides will be freed.

So this is unsafe without any locking:

  if (!ss || ss->module)
  ...

v2:
- add a comment for enum cgroup_subsys_id
- simplify the comment in cgroup_exit()

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

05 Mar, 2013 3 commits

cgroup: no need to check css refs for release notification · f50daa70

Committed by Li Zefan on Mar 01, 2013

We no longer fail rmdir() when there are still css refs, so we don't
need to check css refs in check_for_release().

This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
without cgroup_mutex, so it can race with cgroup_unload_subsys().

cgroup_has_css_refs()
...
  if (ss == NULL || ss->root != cgrp->root)

If ss points to net_cls_subsys, and the cls_cgroup module is unloaded
right after the former check but before the latter, the memory in which
net_cls_subsys resides has become invalid.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cpuset: use cgroup_name() in cpuset_print_task_mems_allowed() · f440d98f

Committed by Li Zefan on Mar 01, 2013

Use cgroup_name() instead of cgrp->dentry->d_name. This makes the code
a bit simpler.

While at it, remove cpuset_name and make cpuset_nodelist a local variable
to cpuset_print_task_mems_allowed().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

cgroup: fix cgroup_path() vs rename() race · 65dff759

Committed by Li Zefan on Mar 01, 2013

rename() will change dentry->d_name. The result of this race can
be worse than seeing a partially rewritten name: we might access
a stale pointer because rename() will re-allocate memory to hold
a longer name.

As accessing dentry->d_name must be protected by dentry->d_lock or
the parent inode's i_mutex, while on the other hand cgroup_path() can
be called with some irq-safe spinlocks held, we can't generate the
cgroup path using dentry->d_name.

Alternatively we make a copy of dentry->d_name and save it in
cgrp->name when a cgroup is created, and update cgrp->name at
rename().

v5: use flexible array instead of zero-size array.
v4: - allocate root_cgroup_name and all root_cgroup->name points to it.
    - add cgroup_name() wrapper.
v3: use kfree_rcu() instead of synchronize_rcu() in user-visible path.
v2: make cgrp->name RCU safe.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>