提交 · 7e47682ea555e7c1edef1d8fd96e2aa4c12abe59 · openeuler / raspberrypi-kernel

15 7月, 2015 1 次提交

cgroup: allow a cgroup subsystem to reject a fork · 7e47682e

由 Aleksa Sarai 提交于 6月 09, 2015

Add a new cgroup subsystem callback can_fork that conditionally
states whether or not the fork is accepted or rejected by a cgroup
policy. In addition, add a cancel_fork callback so that if an error
occurs later in the forking process, any state modified by can_fork can
be reverted.

Allow for a private opaque pointer to be passed from cgroup_can_fork to
cgroup_post_fork, allowing for the fork state to be stored by each
subsystem separately.

Also add a tagging system for cgroup_subsys.h to allow for CGROUP_<TAG>
enumerations to be be defined and used. In addition, explicitly add a
CGROUP_CANFORK_COUNT macro to make arrays easier to define.

This is in preparation for implementing the pids cgroup subsystem.
Signed-off-by: NAleksa Sarai <cyphar@cyphar.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

7e47682e

02 6月, 2015 1 次提交

cgroup, block: implement task_get_css() and use it in bio_associate_current() · ec438699

由 Tejun Heo 提交于 5月 22, 2015

bio_associate_current() currently open codes task_css() and
css_tryget_online() to find and pin $current's blkcg css.  Abstract it
into task_get_css() which is implemented from cgroup side.  As a task
is always associated with an online css for every subsystem except
while the css_set update is propagating, task_get_css() retries till
css_tryget_online() succeeds.

This is a cleanup and shouldn't lead to noticeable behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

ec438699

19 5月, 2015 2 次提交

cgroup: reorganize include/linux/cgroup.h · c326aa2b

由 Tejun Heo 提交于 5月 13, 2015

From c4d440938b5e2015c70594fe6666a099c844f929 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 13 May 2015 16:21:40 -0400

Over time, cgroup.h grew organically and doesn't have much logical
structure at this point.  Separation of cgroup-defs.h in the previous
patch gives us a good chance for reorganizing cgroup.h as changes to
the header are likely to cause conflicts anyway.

This patch reorganizes cgroup.h so that it has consistent logical
grouping.

This is pure reorganization.

v2: Relocating #ifdef CONFIG_CGROUPS caused build failure when cgroup
    is disabled.  Dropped.
Signed-off-by: NTejun Heo <tj@kernel.org>

c326aa2b

cgroup: separate out include/linux/cgroup-defs.h · b4a04ab7

由 Tejun Heo 提交于 5月 13, 2015

From 2d728f74bfc071df06773e2fd7577dd5dab6425d Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 13 May 2015 15:37:01 -0400

This patch separates out cgroup-defs.h from cgroup.h which has grown a
lot of dependencies.  cgroup-defs.h currently only contains constant
and type definitions and can be used to break circular include
dependency.  While moving, definitions are reordered so that
cgroup-defs.h has consistent logical structure.

This patch is pure reorganization.
Signed-off-by: NTejun Heo <tj@kernel.org>

b4a04ab7

07 1月, 2015 1 次提交

cgroup: add dummy css_put() for !CONFIG_CGROUPS · f3ba5380

由 Tejun Heo 提交于 1月 06, 2015

This will later be depended upon by the scheduled cgroup writeback
support.
Signed-off-by: NTejun Heo <tj@kernel.org>

f3ba5380

11 12月, 2014 1 次提交

mm: memcontrol: take a css reference for each charged page · e8ea14cc

由 Johannes Weiner 提交于 12月 10, 2014

Charges currently pin the css indirectly by playing tricks during
css_offline(): user pages stall the offlining process until all of them
have been reparented, whereas kmemcg acquires a keep-alive reference if
outstanding kernel pages are detected at that point.

In preparation for removing all this complexity, make the pinning explicit
and acquire a css references for every charged page.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Reviewed-by: NVladimir Davydov <vdavydov@parallels.com>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e8ea14cc

20 11月, 2014 1 次提交
- A
  kill f_dentry uses · b583043e
  由 Al Viro 提交于 10月 31, 2014
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  b583043e
18 11月, 2014 3 次提交

cgroup: implement cgroup_get_e_css() · eeecbd19

由 Tejun Heo 提交于 11月 18, 2014

Implement cgroup_get_e_css() which finds and gets the effective css
for the specified cgroup and subsystem combination.  This function
always returns a valid pinned css.  This will be used by cgroup
writeback support.

While at it, add comment to cgroup_e_css() to explain why that
function is different from cgroup_get_e_css() and has to test
cgrp->child_subsys_mask instead of cgroup_css(cgrp, ss).
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NZefan Li <lizefan@huawei.com>

eeecbd19

cgroup: add cgroup_subsys->css_e_css_changed() · 56c807ba

由 Tejun Heo 提交于 11月 18, 2014

Add a new cgroup_subsys operatoin ->css_e_css_changed().  This is
invoked if any of the effective csses seen from the css's cgroup may
have changed.  This will be used to implement cgroup writeback
support.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NZefan Li <lizefan@huawei.com>

56c807ba

cgroup: add cgroup_subsys->css_released() · 7d172cc8

由 Tejun Heo 提交于 11月 18, 2014

Add a new cgroup subsys callback css_released().  This is called when
the reference count of the css (cgroup_subsys_state) reaches zero
before RCU scheduling free.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NZefan Li <lizefan@huawei.com>

7d172cc8

19 9月, 2014 4 次提交

cgroup: remove CGRP_RELEASABLE flag · a25eb52e

由 Zefan Li 提交于 9月 19, 2014

We call put_css_set() after setting CGRP_RELEASABLE flag in
cgroup_task_migrate(), but in other places we call it without setting
the flag. I don't see the necessity of this flag.

Moreover once the flag is set, it will never be cleared, unless writing
to the notify_on_release control file, so it can be quite confusing
if we look at the output of debug.releasable.

  # mount -t cgroup -o debug xxx /cgroup
  # mkdir /cgroup/child
  # cat /cgroup/child/debug.releasable
  0   <-- shows 0 though the cgroup is empty
  # echo $$ > /cgroup/child/tasks
  # cat /cgroup/child/debug.releasable
  0
  # echo $$ > /cgroup/tasks && echo $$ > /cgroup/child/tasks
  # cat /proc/child/debug.releasable
  1   <-- shows 1 though the cgroup is not empty

This patch removes the flag, and now debug.releasable shows if the
cgroup is empty or not.
Signed-off-by: NZefan Li <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

a25eb52e

cgroup: remove redundant check in cgroup_ino() · f29374b1

由 Zefan Li 提交于 9月 19, 2014

After we implemented default unified hierarchy, cgrp->kn can never
be NULL.
Signed-off-by: NZefan Li <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

f29374b1

cgroup: simplify proc_cgroup_show() · 006f4ac4

由 Zefan Li 提交于 9月 18, 2014

Use the ONE macro instead of REG, and we can simplify proc_cgroup_show().
Signed-off-by: NZefan Li <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

006f4ac4

cgroup: use a per-cgroup work for release agent · 971ff493

由 Zefan Li 提交于 9月 18, 2014

Instead of using a global work to schedule release agent on removable
cgroups, we change to use a per-cgroup work to do this, which makes
the code much simpler.

v2: use a dedicated work instead of reusing css->destroy_work. (Tejun)
Signed-off-by: NZefan Li <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

971ff493

18 9月, 2014 1 次提交
- L
  cgroup: remove some useless forward declarations · 6213daab
  由 Li Zefan 提交于 9月 17, 2014
```
Signed-off-by: NZefan Li <lizefan@huawei.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
```
  6213daab
15 7月, 2014 4 次提交

cgroup: make CFTYPE_ONLY_ON_DFL and CFTYPE_NO_ internal to cgroup core · 05ebb6e6

由 Tejun Heo 提交于 7月 15, 2014

cgroup now distinguishes cftypes for the default and legacy
hierarchies more explicitly by using separate arrays and
CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE should be and are used only
inside cgroup core proper.  Let's make it clear that the flags are
internal by prefixing them with double underscores.

CFTYPE_INSANE is renamed to __CFTYPE_NOT_ON_DFL for consistency.  The
two flags are also collected and assigned bits >= 16 so that they
aren't mixed with the published flags.

v2: Convert the extra ones in cgroup_exit_cftypes() which are added by
    revision to the previous patch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

05ebb6e6

cgroup: distinguish the default and legacy hierarchies when handling cftypes · a8ddc821

由 Tejun Heo 提交于 7月 15, 2014

Until now, cftype arrays carried files for both the default and legacy
hierarchies and the files which needed to be used on only one of them
were flagged with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE.  This
gets confusing very quickly and we may end up exposing interface files
to the default hierarchy without thinking it through.

This patch makes cgroup core provide separate sets of interfaces for
cftype handling so that the cftypes for the default and legacy
hierarchies are clearly distinguished.  The previous two patches
renamed the existing ones so that they clearly indicate that they're
for the legacy hierarchies.  This patch adds the interface for the
default hierarchy and apply them selectively depending on the
hierarchy type.

* cftypes added through cgroup_subsys->dfl_cftypes and
  cgroup_add_dfl_cftypes() only show up on the default hierarchy.

* cftypes added through cgroup_subsys->legacy_cftypes and
  cgroup_add_legacy_cftypes() only show up on the legacy hierarchies.

* cgroup_subsys->dfl_cftypes and ->legacy_cftypes can point to the
  same array for the cases where the interface files are identical on
  both types of hierarchies.

* This makes all the existing subsystem interface files legacy-only by
  default and all subsystems will have no interface file created when
  enabled on the default hierarchy.  Each subsystem should explicitly
  review and compose the interface for the default hierarchy.

* A boot param "cgroup__DEVEL__legacy_files_on_dfl" is added which
  makes subsystems which haven't decided the interface files for the
  default hierarchy to present the legacy files on the default
  hierarchy so that its behavior on the default hierarchy can be
  tested.  As the awkward name suggests, this is for development only.

* memcg's CFTYPE_INSANE on "use_hierarchy" is noop now as the whole
  array isn't used on the default hierarchy.  The flag is removed.

v2: Updated documentation for cgroup__DEVEL__legacy_files_on_dfl.

v3: Clear CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE when cfts are removed
    as suggested by Li.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

a8ddc821

cgroup: replace cgroup_add_cftypes() with cgroup_add_legacy_cftypes() · 2cf669a5

由 Tejun Heo 提交于 7月 15, 2014

Currently, cftypes added by cgroup_add_cftypes() are used for both the
unified default hierarchy and legacy ones and subsystems can mark each
file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to
appear only on one of them.  This is quite hairy and error-prone.
Also, we may end up exposing interface files to the default hierarchy
without thinking it through.

cgroup_subsys will grow two separate cftype addition functions and
apply each only on the hierarchies of the matching type.  This will
allow organizing cftypes in a lot clearer way and encourage subsystems
to scrutinize the interface which is being exposed in the new default
hierarchy.

In preparation, this patch adds cgroup_add_legacy_cftypes() which
currently is a simple wrapper around cgroup_add_cftypes() and replaces
all cgroup_add_cftypes() usages with it.

While at it, this patch drops a completely spurious return from
__hugetlb_cgroup_file_init().

This patch doesn't introduce any functional differences.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

2cf669a5

cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes · 5577964e

由 Tejun Heo 提交于 7月 15, 2014

Currently, cgroup_subsys->base_cftypes is used for both the unified
default hierarchy and legacy ones and subsystems can mark each file
with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
only on one of them.  This is quite hairy and error-prone.  Also, we
may end up exposing interface files to the default hierarchy without
thinking it through.

cgroup_subsys will grow two separate cftype arrays and apply each only
on the hierarchies of the matching type.  This will allow organizing
cftypes in a lot clearer way and encourage subsystems to scrutinize
the interface which is being exposed in the new default hierarchy.

In preparation, this patch renames cgroup_subsys->base_cftypes to
cgroup_subsys->legacy_cftypes.  This patch is pure rename.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

5577964e

09 7月, 2014 6 次提交

cgroup: remove sane_behavior support on non-default hierarchies · aa6ec29b

由 Tejun Heo 提交于 7月 09, 2014

sane_behavior has been used as a development vehicle for the default
unified hierarchy.  Now that the default hierarchy is in place, the
flag became redundant and confusing as its usage is allowed on all
hierarchies.  There are gonna be either the default hierarchy or
legacy ones.  Let's make that clear by removing sane_behavior support
on non-default hierarchies.

This patch replaces cgroup_sane_behavior() with cgroup_on_dfl().  The
comment on top of CGRP_ROOT_SANE_BEHAVIOR is moved to on top of
cgroup_on_dfl() with sane_behavior specific part dropped.

On the default and legacy hierarchies w/o sane_behavior, this
shouldn't cause any behavior differences.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NVivek Goyal <vgoyal@redhat.com>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>

aa6ec29b

cgroup: remove CGRP_ROOT_OPTION_MASK · 7450e90b

由 Tejun Heo 提交于 7月 09, 2014

cgroup_root->flags only contains CGRP_ROOT_* flags and there's no
reason to mask the flags.  Remove CGRP_ROOT_OPTION_MASK.

This doesn't cause any behavior differences.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

7450e90b

cgroup: implement cgroup_subsys->depends_on · af0ba678

由 Tejun Heo 提交于 7月 08, 2014

Currently, the blkio subsystem attributes all of writeback IOs to the
root.  One of the issues is that there's no way to tell who originated
a writeback IO from block layer.  Those IOs are usually issued
asynchronously from a task which didn't have anything to do with
actually generating the dirty pages.  The memory subsystem, when
enabled, already keeps track of the ownership of each dirty page and
it's desirable for blkio to piggyback instead of adding its own
per-page tag.

blkio piggybacking on memory is an implementation detail which
preferably should be handled automatically without requiring explicit
userland action.  To achieve that, this patch implements
cgroup_subsys->depends_on which contains the mask of subsystems which
should be enabled together when the subsystem is enabled.

The previous patches already implemented the support for enabled but
invisible subsystems and cgroup_subsys->depends_on can be easily
implemented by updating cgroup_refresh_child_subsys_mask() so that it
calculates cgroup->child_subsys_mask considering
cgroup_subsys->depends_on of the explicitly enabled subsystems.

Documentation/cgroups/unified-hierarchy.txt is updated to explain that
subsystems may not become immediately available after being unused
from userland and that dependency could be a factor in it.  As
subsystems may already keep residual references, this doesn't
significantly change how subsystem rebinding can be used.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>

af0ba678

cgroup: implement cgroup_subsys->css_reset() · b4536f0c

由 Tejun Heo 提交于 7月 08, 2014

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

The previous patches added support for explicitly and implicitly
enabled subsystems and showing/hiding their interface files.  An
explicitly enabled subsystem may become implicitly enabled if it's
turned off through "cgroup.subtree_control" but there are subsystems
depending on it.  In such cases, the subsystem, as it's turned off
when seen from userland, shouldn't enforce any resource control.
Also, the subsystem may be explicitly turned on later again and its
interface files should be as close to the intial state as possible.

This patch adds cgroup_subsys->css_reset() which is invoked when a css
is hidden.  The callback should disable resource control and reset the
state to the vanilla state.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>

b4536f0c

cgroup: make interface files visible iff enabled on cgroup->subtree_control · f63070d3

由 Tejun Heo 提交于 7月 08, 2014

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

The preceding patch distinguished cgroup->subtree_control and
->child_subsys_mask where the former is the subsystems explicitly
configured by the userland and the latter is all enabled subsystems
currently is equal to the former but will include subsystems
implicitly enabled through dependency.

Subsystems which are enabled due to dependency shouldn't be visible to
userland.  This patch updates cgroup_subtree_control_write() and
create_css() such that interface files are not created for implicitly
enabled subsytems.

* @visible paramter is added to create_css().  Interface files are
  created only when true.

* If an already implicitly enabled subsystem is turned on through
  "cgroup.subtree_control", the existing css should be used.  css
  draining is skipped.

* cgroup_subtree_control_write() computes the new target
  cgroup->child_subsys_mask and create/kill or show/hide csses
  accordingly.

As the two subsystem masks are still kept identical, this patch
doesn't introduce any behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>

f63070d3

cgroup: introduce cgroup->subtree_control · 667c2491

由 Tejun Heo 提交于 7月 08, 2014

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

Previously, cgroup->child_subsys_mask directly reflected
"cgroup.subtree_control" and the enabled subsystems in the child
cgroups.  This patch adds cgroup->subtree_control which
"cgroup.subtree_control" operates on.  cgroup->child_subsys_mask is
now calculated from cgroup->subtree_control by
cgroup_refresh_child_subsys_mask(), which sets it identical to
cgroup->subtree_control for now.

This will allow using cgroup->child_subsys_mask for all the enabled
subsystems including the implicit ones and ->subtree_control for
tracking the explicitly requested ones.  This patch keeps the two
masks identical and doesn't introduce any behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>

667c2491

20 5月, 2014 1 次提交

cgroup: disallow debug controller on the default hierarchy · 5533e011

由 Tejun Heo 提交于 5月 14, 2014

The debug controller, as its name suggests, exposes cgroup core
internals to userland to aid debugging.  Unfortunately, except for the
name, there's no provision to prevent its usage in production
configurations and the controller is widely enabled and mounted
leaking internal details to userland.  Like most other debug
information, the information exposed by debug isn't interesting even
for debugging itself once the related parts are working reliably.

This controller has no reason for existing.  This patch implements
cgrp_dfl_root_inhibit_ss_mask which can suppress specific subsystems
on the default hierarchy and adds the debug subsystem to it so that it
can be gradually deprecated as usages move towards the unified
hierarchy.
Signed-off-by: NTejun Heo <tj@kernel.org>

5533e011

17 5月, 2014 10 次提交

cgroup: implement css_tryget() · 6f4524d3

由 Tejun Heo 提交于 5月 16, 2014

Implement css_tryget() which tries to grab a cgroup_subsys_state's
reference as long as it already hasn't reached zero.  Combined with
the recent css iterator changes to include offline && !released csses
during traversal, this can be used to access csses regardless of its
online state.

v2: Take the new flag CSS_NO_REF into account.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>

6f4524d3

cgroup: convert cgroup_has_live_children() into css_has_online_children() · f3d46500

由 Tejun Heo 提交于 5月 16, 2014

Now that cgroup liveliness and css onliness are the same state,
convert cgroup_has_live_children() into css_has_online_children() so
that it can be used for actual csses too.  The function now uses
css_for_each_child() for iteration and is published.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

f3d46500

cgroup: use CSS_ONLINE instead of CGRP_DEAD · 184faf32

由 Tejun Heo 提交于 5月 16, 2014

Use CSS_ONLINE on the self css to indicate whether a cgroup has been
killed instead of CGRP_DEAD.  This will allow re-using css online test
for cgroup liveliness test.  This doesn't introduce any functional
change.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

184faf32

cgroup: iterate cgroup_subsys_states directly · c2931b70

由 Tejun Heo 提交于 5月 16, 2014

Currently, css_next_child() is implemented as finding the next child
cgroup which has the css enabled, which used to be the only way to do
it as only cgroups participated in sibling lists and thus could be
iteratd.  This works as long as what's required during iteration is
not missing online csses; however, it turns out that there are use
cases where offlined but not yet released csses need to be iterated.
This is difficult to implement through cgroup iteration the unified
hierarchy as there may be multiple dying csses for the same subsystem
associated with single cgroup.

After the recent changes, the cgroup self and regular csses behave
identically in how they're linked and unlinked from the sibling lists
including assertion of CSS_RELEASED and css_next_child() can simply
switch to iterating csses directly.  This both simplifies the logic
and ensures that all visible non-released csses are included in the
iteration whether there are multiple dying csses for a subsystem or
not.

As all other iterators depend on css_next_child() for sibling
iteration, this changes behaviors of all css iterators.  Add and
update explanations on the css states which are included in traversal
to all iterators.

As css iteration could always contain offlined csses, this shouldn't
break any of the current users and new usages which need iteration of
all on and offline csses can make use of the new semantics.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>

c2931b70

cgroup: introduce CSS_RELEASED and reduce css iteration fallback window · de3f0341

由 Tejun Heo 提交于 5月 16, 2014

css iterations allow the caller to drop RCU read lock.  As long as the
caller keeps the current position accessible, it can simply re-grab
RCU read lock later and continue iteration.  This is achieved by using
CGRP_DEAD to detect whether the current positions next pointer is safe
to dereference and if not re-iterate from the beginning to the next
position using ->serial_nr.

CGRP_DEAD is used as the marker to invalidate the next pointer and the
only requirement is that the marker is set before the next sibling
starts its RCU grace period.  Because CGRP_DEAD is set at the end of
cgroup_destroy_locked() but the cgroup is unlinked when the reference
count reaches zero, we currently have a rather large window where this
fallback re-iteration logic can be triggered.

This patch introduces CSS_RELEASED which is set when a css is unlinked
from its sibling list.  This still keeps the re-iteration logic
working while drastically reducing the window of its activation.
While at it, rewrite the comment in css_next_child() to reflect the
new flag and better explain the synchronization.

This will also enable iterating csses directly instead of through
cgroups.

v2: CSS_RELEASED now assigned to 1 << 2 as 1 << 0 is used by
    CSS_NO_REF.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

de3f0341

cgroup: move cgroup->serial_nr into cgroup_subsys_state · 0cb51d71

由 Tejun Heo 提交于 5月 16, 2014

We're moving towards using cgroup_subsys_states as the fundamental
structural blocks.  All csses including the cgroup->self and actual
ones now form trees through css->children and ->sibling which follow
the same rules as what cgroup->children and ->sibling followed.  This
patch moves cgroup->serial_nr which is used to implement css iteration
into css.

Note that all csses, regardless of their types, allocate their serial
numbers from the same monotonically increasing counter.  This doesn't
affect the ordering needed by css iteration or cause any other
material behavior changes.  This will be used to update css iteration.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

0cb51d71

cgroup: move cgroup->sibling and ->children into cgroup_subsys_state · d5c419b6

由 Tejun Heo 提交于 5月 16, 2014

We're moving towards using cgroup_subsys_states as the fundamental
structural blocks.  Let's move cgroup->sibling and ->children into
cgroup_subsys_state.  This is pure move without functional change and
only cgroup->self's fields are actually used.  Other csses will make
use of the fields later.

While at it, update init_and_link_css() so that it zeroes the whole
css before initializing it and remove explicit zeroing of ->flags.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

d5c419b6

cgroup: remove cgroup->parent · d51f39b0

由 Tejun Heo 提交于 5月 16, 2014

cgroup->parent is redundant as cgroup->self.parent can also be used to
determine the parent cgroup and we're moving towards using
cgroup_subsys_states as the fundamental structural blocks.  This patch
introduces cgroup_parent() which follows cgroup->self.parent and
removes cgroup->parent.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

d51f39b0

cgroup: remove css_parent() · 5c9d535b

由 Tejun Heo 提交于 5月 16, 2014

cgroup in general is moving towards using cgroup_subsys_state as the
fundamental structural component and css_parent() was introduced to
convert from using cgroup->parent to css->parent.  It was quite some
time ago and we're moving forward with making css more prominent.

This patch drops the trivial wrapper css_parent() and let the users
dereference css->parent.  While at it, explicitly mark fields of css
which are public and immutable.

v2: New usage from device_cgroup.c converted.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: N"David S. Miller" <davem@davemloft.net>
Acked-by: NLi Zefan <lizefan@huawei.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Johannes Weiner <hannes@cmpxchg.org>

5c9d535b

cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css · 3b514d24

由 Tejun Heo 提交于 5月 16, 2014

9395a450 ("cgroup: enable refcnting for root csses") enabled
reference counting for root csses (cgroup_subsys_states) so that
cgroup's self csses can be used to manage the lifetime of the
containing cgroups.

Unfortunately, this change was incorrect.  During early init,
cgrp_dfl_root self css refcnt is used.  percpu_ref can't initialized
during early init and its initialization is deferred till
cgroup_init() time.  This means that cpu was using percpu_ref which
wasn't properly initialized.  Due to the way percpu variables are laid
out on x86, this didn't blow up immediately on x86 but ended up
incrementing and decrementing the percpu variable at offset zero,
whatever it may be; however, on other archs, this caused fault and
early boot failure.

As cgroup self csses for root cgroups of non-dfl hierarchies need
working refcounting, we can't revert 9395a450.  This patch adds
CSS_NO_REF which explicitly inhibits reference counting on the css and
sets it on all normal (non-self) csses and cgroup_dfl_root self css.

v2: cgrp_dfl_root.self is the offending one.  Set the flag on it.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NStephen Warren <swarren@nvidia.com>
Tested-by: NStephen Warren <swarren@nvidia.com>
Fixes: 9395a450 ("cgroup: enable refcnting for root csses")

3b514d24

14 5月, 2014 4 次提交

cgroup: use cgroup->self.refcnt for cgroup refcnting · 9d755d33

由 Tejun Heo 提交于 5月 14, 2014

Currently cgroup implements refcnting separately using atomic_t
cgroup->refcnt.  The destruction paths of cgroup and css are rather
complex and bear a lot of similiarities including the use of RCU and
bouncing to a work item.

This patch makes cgroup use the refcnt of self css for refcnting
instead of using its own.  This makes cgroup refcnting use css's
percpu refcnt and share the destruction mechanism.

* css_release_work_fn() and css_free_work_fn() are updated to handle
  both csses and cgroups.  This is a bit messy but should do until we
  can make cgroup->self a full css, which currently can't be done
  thanks to multiple hierarchies.

* cgroup_destroy_locked() now performs
  percpu_ref_kill(&cgrp->self.refcnt) instead of cgroup_put(cgrp).

* Negative refcnt sanity check in cgroup_get() is no longer necessary
  as percpu_ref already handles it.

* Similarly, as a cgroup which hasn't been killed will never be
  released regardless of its refcnt value and percpu_ref has sanity
  check on kill, cgroup_is_dead() sanity check in cgroup_put() is no
  longer necessary.

* As whether a refcnt reached zero or not can only be decided after
  the reference count is killed, cgroup_root->cgrp's refcnting can no
  longer be used to decide whether to kill the root or not.  Let's
  make cgroup_kill_sb() explicitly initiate destruction if the root
  doesn't have any children.  This makes sense anyway as unmounted
  cgroup hierarchy without any children should be destroyed.

While this is a bit messy, this will allow pushing more bookkeeping
towards cgroup->self and thus handling cgroups and csses in more
uniform way.  In the very long term, it should be possible to
introduce a base subsystem and convert the self css to a proper one
making things whole lot simpler and unified.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

9d755d33

cgroup: enable refcnting for root csses · 9395a450

由 Tejun Heo 提交于 5月 14, 2014

Currently, css_get(), css_tryget() and css_tryget_online() are noops
for root csses as an optimization; however, we're planning to use css
refcnts to track of cgroup lifetime too and root cgroups also need to
be reference counted.  Since css has been converted to percpu_refcnt,
the overhead of refcnting is miniscule and this optimization isn't too
meaningful anymore.  Furthermore, controllers which optimize the root
cgroup often never even invoke these functions in their hot paths.

This patch enables refcnting for root csses too.  This makes CSS_ROOT
flag unused and removes it.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

9395a450

cgroup: remove cgroup_destory_css_killed() · 249f3468

由 Tejun Heo 提交于 5月 14, 2014

cgroup_destroy_css_killed() is cgroup destruction stage which happens
after all csses are offlined.  After the recent updates, it no longer
does anything other than putting the base reference.  This patch
removes the function and makes cgroup_destroy_locked() put the base
ref at the end isntead.

This also makes cgroup->nr_css unnecessary.  Removed.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

249f3468

cgroup: rename cgroup->dummy_css to ->self and move it to the top · 9d800df1

由 Tejun Heo 提交于 5月 14, 2014

cgroup->dummy_css is used as the placeholder css when performing css
oriended operations on the cgroup.  We're gonna shift more cgroup
management to this css.  Let's rename it to ->self and move it to the
top.

This is pure rename and field relocation.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLi Zefan <lizefan@huawei.com>

9d800df1