1. 11 12月, 2014 1 次提交
  2. 20 11月, 2014 1 次提交
  3. 18 11月, 2014 3 次提交
  4. 19 9月, 2014 4 次提交
  5. 18 9月, 2014 1 次提交
  6. 15 7月, 2014 4 次提交
    • T
      cgroup: make CFTYPE_ONLY_ON_DFL and CFTYPE_NO_ internal to cgroup core · 05ebb6e6
      Tejun Heo 提交于
      cgroup now distinguishes cftypes for the default and legacy
      hierarchies more explicitly by using separate arrays and
      CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE should be and are used only
      inside cgroup core proper.  Let's make it clear that the flags are
      internal by prefixing them with double underscores.
      
      CFTYPE_INSANE is renamed to __CFTYPE_NOT_ON_DFL for consistency.  The
      two flags are also collected and assigned bits >= 16 so that they
      aren't mixed with the published flags.
      
      v2: Convert the extra ones in cgroup_exit_cftypes() which are added by
          revision to the previous patch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      05ebb6e6
    • T
      cgroup: distinguish the default and legacy hierarchies when handling cftypes · a8ddc821
      Tejun Heo 提交于
      Until now, cftype arrays carried files for both the default and legacy
      hierarchies and the files which needed to be used on only one of them
      were flagged with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE.  This
      gets confusing very quickly and we may end up exposing interface files
      to the default hierarchy without thinking it through.
      
      This patch makes cgroup core provide separate sets of interfaces for
      cftype handling so that the cftypes for the default and legacy
      hierarchies are clearly distinguished.  The previous two patches
      renamed the existing ones so that they clearly indicate that they're
      for the legacy hierarchies.  This patch adds the interface for the
      default hierarchy and apply them selectively depending on the
      hierarchy type.
      
      * cftypes added through cgroup_subsys->dfl_cftypes and
        cgroup_add_dfl_cftypes() only show up on the default hierarchy.
      
      * cftypes added through cgroup_subsys->legacy_cftypes and
        cgroup_add_legacy_cftypes() only show up on the legacy hierarchies.
      
      * cgroup_subsys->dfl_cftypes and ->legacy_cftypes can point to the
        same array for the cases where the interface files are identical on
        both types of hierarchies.
      
      * This makes all the existing subsystem interface files legacy-only by
        default and all subsystems will have no interface file created when
        enabled on the default hierarchy.  Each subsystem should explicitly
        review and compose the interface for the default hierarchy.
      
      * A boot param "cgroup__DEVEL__legacy_files_on_dfl" is added which
        makes subsystems which haven't decided the interface files for the
        default hierarchy to present the legacy files on the default
        hierarchy so that its behavior on the default hierarchy can be
        tested.  As the awkward name suggests, this is for development only.
      
      * memcg's CFTYPE_INSANE on "use_hierarchy" is noop now as the whole
        array isn't used on the default hierarchy.  The flag is removed.
      
      v2: Updated documentation for cgroup__DEVEL__legacy_files_on_dfl.
      
      v3: Clear CFTYPE_ONLY_ON_DFL and CFTYPE_INSANE when cfts are removed
          as suggested by Li.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      a8ddc821
    • T
      cgroup: replace cgroup_add_cftypes() with cgroup_add_legacy_cftypes() · 2cf669a5
      Tejun Heo 提交于
      Currently, cftypes added by cgroup_add_cftypes() are used for both the
      unified default hierarchy and legacy ones and subsystems can mark each
      file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to
      appear only on one of them.  This is quite hairy and error-prone.
      Also, we may end up exposing interface files to the default hierarchy
      without thinking it through.
      
      cgroup_subsys will grow two separate cftype addition functions and
      apply each only on the hierarchies of the matching type.  This will
      allow organizing cftypes in a lot clearer way and encourage subsystems
      to scrutinize the interface which is being exposed in the new default
      hierarchy.
      
      In preparation, this patch adds cgroup_add_legacy_cftypes() which
      currently is a simple wrapper around cgroup_add_cftypes() and replaces
      all cgroup_add_cftypes() usages with it.
      
      While at it, this patch drops a completely spurious return from
      __hugetlb_cgroup_file_init().
      
      This patch doesn't introduce any functional differences.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      2cf669a5
    • T
      cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes · 5577964e
      Tejun Heo 提交于
      Currently, cgroup_subsys->base_cftypes is used for both the unified
      default hierarchy and legacy ones and subsystems can mark each file
      with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
      only on one of them.  This is quite hairy and error-prone.  Also, we
      may end up exposing interface files to the default hierarchy without
      thinking it through.
      
      cgroup_subsys will grow two separate cftype arrays and apply each only
      on the hierarchies of the matching type.  This will allow organizing
      cftypes in a lot clearer way and encourage subsystems to scrutinize
      the interface which is being exposed in the new default hierarchy.
      
      In preparation, this patch renames cgroup_subsys->base_cftypes to
      cgroup_subsys->legacy_cftypes.  This patch is pure rename.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      5577964e
  7. 09 7月, 2014 6 次提交
    • T
      cgroup: remove sane_behavior support on non-default hierarchies · aa6ec29b
      Tejun Heo 提交于
      sane_behavior has been used as a development vehicle for the default
      unified hierarchy.  Now that the default hierarchy is in place, the
      flag became redundant and confusing as its usage is allowed on all
      hierarchies.  There are gonna be either the default hierarchy or
      legacy ones.  Let's make that clear by removing sane_behavior support
      on non-default hierarchies.
      
      This patch replaces cgroup_sane_behavior() with cgroup_on_dfl().  The
      comment on top of CGRP_ROOT_SANE_BEHAVIOR is moved to on top of
      cgroup_on_dfl() with sane_behavior specific part dropped.
      
      On the default and legacy hierarchies w/o sane_behavior, this
      shouldn't cause any behavior differences.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      aa6ec29b
    • T
      cgroup: remove CGRP_ROOT_OPTION_MASK · 7450e90b
      Tejun Heo 提交于
      cgroup_root->flags only contains CGRP_ROOT_* flags and there's no
      reason to mask the flags.  Remove CGRP_ROOT_OPTION_MASK.
      
      This doesn't cause any behavior differences.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      7450e90b
    • T
      cgroup: implement cgroup_subsys->depends_on · af0ba678
      Tejun Heo 提交于
      Currently, the blkio subsystem attributes all of writeback IOs to the
      root.  One of the issues is that there's no way to tell who originated
      a writeback IO from block layer.  Those IOs are usually issued
      asynchronously from a task which didn't have anything to do with
      actually generating the dirty pages.  The memory subsystem, when
      enabled, already keeps track of the ownership of each dirty page and
      it's desirable for blkio to piggyback instead of adding its own
      per-page tag.
      
      blkio piggybacking on memory is an implementation detail which
      preferably should be handled automatically without requiring explicit
      userland action.  To achieve that, this patch implements
      cgroup_subsys->depends_on which contains the mask of subsystems which
      should be enabled together when the subsystem is enabled.
      
      The previous patches already implemented the support for enabled but
      invisible subsystems and cgroup_subsys->depends_on can be easily
      implemented by updating cgroup_refresh_child_subsys_mask() so that it
      calculates cgroup->child_subsys_mask considering
      cgroup_subsys->depends_on of the explicitly enabled subsystems.
      
      Documentation/cgroups/unified-hierarchy.txt is updated to explain that
      subsystems may not become immediately available after being unused
      from userland and that dependency could be a factor in it.  As
      subsystems may already keep residual references, this doesn't
      significantly change how subsystem rebinding can be used.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      af0ba678
    • T
      cgroup: implement cgroup_subsys->css_reset() · b4536f0c
      Tejun Heo 提交于
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      The previous patches added support for explicitly and implicitly
      enabled subsystems and showing/hiding their interface files.  An
      explicitly enabled subsystem may become implicitly enabled if it's
      turned off through "cgroup.subtree_control" but there are subsystems
      depending on it.  In such cases, the subsystem, as it's turned off
      when seen from userland, shouldn't enforce any resource control.
      Also, the subsystem may be explicitly turned on later again and its
      interface files should be as close to the intial state as possible.
      
      This patch adds cgroup_subsys->css_reset() which is invoked when a css
      is hidden.  The callback should disable resource control and reset the
      state to the vanilla state.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      b4536f0c
    • T
      cgroup: make interface files visible iff enabled on cgroup->subtree_control · f63070d3
      Tejun Heo 提交于
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      The preceding patch distinguished cgroup->subtree_control and
      ->child_subsys_mask where the former is the subsystems explicitly
      configured by the userland and the latter is all enabled subsystems
      currently is equal to the former but will include subsystems
      implicitly enabled through dependency.
      
      Subsystems which are enabled due to dependency shouldn't be visible to
      userland.  This patch updates cgroup_subtree_control_write() and
      create_css() such that interface files are not created for implicitly
      enabled subsytems.
      
      * @visible paramter is added to create_css().  Interface files are
        created only when true.
      
      * If an already implicitly enabled subsystem is turned on through
        "cgroup.subtree_control", the existing css should be used.  css
        draining is skipped.
      
      * cgroup_subtree_control_write() computes the new target
        cgroup->child_subsys_mask and create/kill or show/hide csses
        accordingly.
      
      As the two subsystem masks are still kept identical, this patch
      doesn't introduce any behavior changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      f63070d3
    • T
      cgroup: introduce cgroup->subtree_control · 667c2491
      Tejun Heo 提交于
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      Previously, cgroup->child_subsys_mask directly reflected
      "cgroup.subtree_control" and the enabled subsystems in the child
      cgroups.  This patch adds cgroup->subtree_control which
      "cgroup.subtree_control" operates on.  cgroup->child_subsys_mask is
      now calculated from cgroup->subtree_control by
      cgroup_refresh_child_subsys_mask(), which sets it identical to
      cgroup->subtree_control for now.
      
      This will allow using cgroup->child_subsys_mask for all the enabled
      subsystems including the implicit ones and ->subtree_control for
      tracking the explicitly requested ones.  This patch keeps the two
      masks identical and doesn't introduce any behavior changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      667c2491
  8. 20 5月, 2014 1 次提交
    • T
      cgroup: disallow debug controller on the default hierarchy · 5533e011
      Tejun Heo 提交于
      The debug controller, as its name suggests, exposes cgroup core
      internals to userland to aid debugging.  Unfortunately, except for the
      name, there's no provision to prevent its usage in production
      configurations and the controller is widely enabled and mounted
      leaking internal details to userland.  Like most other debug
      information, the information exposed by debug isn't interesting even
      for debugging itself once the related parts are working reliably.
      
      This controller has no reason for existing.  This patch implements
      cgrp_dfl_root_inhibit_ss_mask which can suppress specific subsystems
      on the default hierarchy and adds the debug subsystem to it so that it
      can be gradually deprecated as usages move towards the unified
      hierarchy.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      5533e011
  9. 17 5月, 2014 10 次提交
    • T
      cgroup: implement css_tryget() · 6f4524d3
      Tejun Heo 提交于
      Implement css_tryget() which tries to grab a cgroup_subsys_state's
      reference as long as it already hasn't reached zero.  Combined with
      the recent css iterator changes to include offline && !released csses
      during traversal, this can be used to access csses regardless of its
      online state.
      
      v2: Take the new flag CSS_NO_REF into account.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      6f4524d3
    • T
      cgroup: convert cgroup_has_live_children() into css_has_online_children() · f3d46500
      Tejun Heo 提交于
      Now that cgroup liveliness and css onliness are the same state,
      convert cgroup_has_live_children() into css_has_online_children() so
      that it can be used for actual csses too.  The function now uses
      css_for_each_child() for iteration and is published.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      f3d46500
    • T
      cgroup: use CSS_ONLINE instead of CGRP_DEAD · 184faf32
      Tejun Heo 提交于
      Use CSS_ONLINE on the self css to indicate whether a cgroup has been
      killed instead of CGRP_DEAD.  This will allow re-using css online test
      for cgroup liveliness test.  This doesn't introduce any functional
      change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      184faf32
    • T
      cgroup: iterate cgroup_subsys_states directly · c2931b70
      Tejun Heo 提交于
      Currently, css_next_child() is implemented as finding the next child
      cgroup which has the css enabled, which used to be the only way to do
      it as only cgroups participated in sibling lists and thus could be
      iteratd.  This works as long as what's required during iteration is
      not missing online csses; however, it turns out that there are use
      cases where offlined but not yet released csses need to be iterated.
      This is difficult to implement through cgroup iteration the unified
      hierarchy as there may be multiple dying csses for the same subsystem
      associated with single cgroup.
      
      After the recent changes, the cgroup self and regular csses behave
      identically in how they're linked and unlinked from the sibling lists
      including assertion of CSS_RELEASED and css_next_child() can simply
      switch to iterating csses directly.  This both simplifies the logic
      and ensures that all visible non-released csses are included in the
      iteration whether there are multiple dying csses for a subsystem or
      not.
      
      As all other iterators depend on css_next_child() for sibling
      iteration, this changes behaviors of all css iterators.  Add and
      update explanations on the css states which are included in traversal
      to all iterators.
      
      As css iteration could always contain offlined csses, this shouldn't
      break any of the current users and new usages which need iteration of
      all on and offline csses can make use of the new semantics.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      c2931b70
    • T
      cgroup: introduce CSS_RELEASED and reduce css iteration fallback window · de3f0341
      Tejun Heo 提交于
      css iterations allow the caller to drop RCU read lock.  As long as the
      caller keeps the current position accessible, it can simply re-grab
      RCU read lock later and continue iteration.  This is achieved by using
      CGRP_DEAD to detect whether the current positions next pointer is safe
      to dereference and if not re-iterate from the beginning to the next
      position using ->serial_nr.
      
      CGRP_DEAD is used as the marker to invalidate the next pointer and the
      only requirement is that the marker is set before the next sibling
      starts its RCU grace period.  Because CGRP_DEAD is set at the end of
      cgroup_destroy_locked() but the cgroup is unlinked when the reference
      count reaches zero, we currently have a rather large window where this
      fallback re-iteration logic can be triggered.
      
      This patch introduces CSS_RELEASED which is set when a css is unlinked
      from its sibling list.  This still keeps the re-iteration logic
      working while drastically reducing the window of its activation.
      While at it, rewrite the comment in css_next_child() to reflect the
      new flag and better explain the synchronization.
      
      This will also enable iterating csses directly instead of through
      cgroups.
      
      v2: CSS_RELEASED now assigned to 1 << 2 as 1 << 0 is used by
          CSS_NO_REF.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      de3f0341
    • T
      cgroup: move cgroup->serial_nr into cgroup_subsys_state · 0cb51d71
      Tejun Heo 提交于
      We're moving towards using cgroup_subsys_states as the fundamental
      structural blocks.  All csses including the cgroup->self and actual
      ones now form trees through css->children and ->sibling which follow
      the same rules as what cgroup->children and ->sibling followed.  This
      patch moves cgroup->serial_nr which is used to implement css iteration
      into css.
      
      Note that all csses, regardless of their types, allocate their serial
      numbers from the same monotonically increasing counter.  This doesn't
      affect the ordering needed by css iteration or cause any other
      material behavior changes.  This will be used to update css iteration.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      0cb51d71
    • T
      cgroup: move cgroup->sibling and ->children into cgroup_subsys_state · d5c419b6
      Tejun Heo 提交于
      We're moving towards using cgroup_subsys_states as the fundamental
      structural blocks.  Let's move cgroup->sibling and ->children into
      cgroup_subsys_state.  This is pure move without functional change and
      only cgroup->self's fields are actually used.  Other csses will make
      use of the fields later.
      
      While at it, update init_and_link_css() so that it zeroes the whole
      css before initializing it and remove explicit zeroing of ->flags.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      d5c419b6
    • T
      cgroup: remove cgroup->parent · d51f39b0
      Tejun Heo 提交于
      cgroup->parent is redundant as cgroup->self.parent can also be used to
      determine the parent cgroup and we're moving towards using
      cgroup_subsys_states as the fundamental structural blocks.  This patch
      introduces cgroup_parent() which follows cgroup->self.parent and
      removes cgroup->parent.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      d51f39b0
    • T
      cgroup: remove css_parent() · 5c9d535b
      Tejun Heo 提交于
      cgroup in general is moving towards using cgroup_subsys_state as the
      fundamental structural component and css_parent() was introduced to
      convert from using cgroup->parent to css->parent.  It was quite some
      time ago and we're moving forward with making css more prominent.
      
      This patch drops the trivial wrapper css_parent() and let the users
      dereference css->parent.  While at it, explicitly mark fields of css
      which are public and immutable.
      
      v2: New usage from device_cgroup.c converted.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: N"David S. Miller" <davem@davemloft.net>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      5c9d535b
    • T
      cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css · 3b514d24
      Tejun Heo 提交于
      9395a450 ("cgroup: enable refcnting for root csses") enabled
      reference counting for root csses (cgroup_subsys_states) so that
      cgroup's self csses can be used to manage the lifetime of the
      containing cgroups.
      
      Unfortunately, this change was incorrect.  During early init,
      cgrp_dfl_root self css refcnt is used.  percpu_ref can't initialized
      during early init and its initialization is deferred till
      cgroup_init() time.  This means that cpu was using percpu_ref which
      wasn't properly initialized.  Due to the way percpu variables are laid
      out on x86, this didn't blow up immediately on x86 but ended up
      incrementing and decrementing the percpu variable at offset zero,
      whatever it may be; however, on other archs, this caused fault and
      early boot failure.
      
      As cgroup self csses for root cgroups of non-dfl hierarchies need
      working refcounting, we can't revert 9395a450.  This patch adds
      CSS_NO_REF which explicitly inhibits reference counting on the css and
      sets it on all normal (non-self) csses and cgroup_dfl_root self css.
      
      v2: cgrp_dfl_root.self is the offending one.  Set the flag on it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NStephen Warren <swarren@nvidia.com>
      Tested-by: NStephen Warren <swarren@nvidia.com>
      Fixes: 9395a450 ("cgroup: enable refcnting for root csses")
      3b514d24
  10. 14 5月, 2014 9 次提交
    • T
      cgroup: use cgroup->self.refcnt for cgroup refcnting · 9d755d33
      Tejun Heo 提交于
      Currently cgroup implements refcnting separately using atomic_t
      cgroup->refcnt.  The destruction paths of cgroup and css are rather
      complex and bear a lot of similiarities including the use of RCU and
      bouncing to a work item.
      
      This patch makes cgroup use the refcnt of self css for refcnting
      instead of using its own.  This makes cgroup refcnting use css's
      percpu refcnt and share the destruction mechanism.
      
      * css_release_work_fn() and css_free_work_fn() are updated to handle
        both csses and cgroups.  This is a bit messy but should do until we
        can make cgroup->self a full css, which currently can't be done
        thanks to multiple hierarchies.
      
      * cgroup_destroy_locked() now performs
        percpu_ref_kill(&cgrp->self.refcnt) instead of cgroup_put(cgrp).
      
      * Negative refcnt sanity check in cgroup_get() is no longer necessary
        as percpu_ref already handles it.
      
      * Similarly, as a cgroup which hasn't been killed will never be
        released regardless of its refcnt value and percpu_ref has sanity
        check on kill, cgroup_is_dead() sanity check in cgroup_put() is no
        longer necessary.
      
      * As whether a refcnt reached zero or not can only be decided after
        the reference count is killed, cgroup_root->cgrp's refcnting can no
        longer be used to decide whether to kill the root or not.  Let's
        make cgroup_kill_sb() explicitly initiate destruction if the root
        doesn't have any children.  This makes sense anyway as unmounted
        cgroup hierarchy without any children should be destroyed.
      
      While this is a bit messy, this will allow pushing more bookkeeping
      towards cgroup->self and thus handling cgroups and csses in more
      uniform way.  In the very long term, it should be possible to
      introduce a base subsystem and convert the self css to a proper one
      making things whole lot simpler and unified.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      9d755d33
    • T
      cgroup: enable refcnting for root csses · 9395a450
      Tejun Heo 提交于
      Currently, css_get(), css_tryget() and css_tryget_online() are noops
      for root csses as an optimization; however, we're planning to use css
      refcnts to track of cgroup lifetime too and root cgroups also need to
      be reference counted.  Since css has been converted to percpu_refcnt,
      the overhead of refcnting is miniscule and this optimization isn't too
      meaningful anymore.  Furthermore, controllers which optimize the root
      cgroup often never even invoke these functions in their hot paths.
      
      This patch enables refcnting for root csses too.  This makes CSS_ROOT
      flag unused and removes it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      9395a450
    • T
      cgroup: remove cgroup_destory_css_killed() · 249f3468
      Tejun Heo 提交于
      cgroup_destroy_css_killed() is cgroup destruction stage which happens
      after all csses are offlined.  After the recent updates, it no longer
      does anything other than putting the base reference.  This patch
      removes the function and makes cgroup_destroy_locked() put the base
      ref at the end isntead.
      
      This also makes cgroup->nr_css unnecessary.  Removed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      249f3468
    • T
      cgroup: rename cgroup->dummy_css to ->self and move it to the top · 9d800df1
      Tejun Heo 提交于
      cgroup->dummy_css is used as the placeholder css when performing css
      oriended operations on the cgroup.  We're gonna shift more cgroup
      management to this css.  Let's rename it to ->self and move it to the
      top.
      
      This is pure rename and field relocation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      9d800df1
    • T
      cgroup: remove cgroup->control_kn · b7fc5ad2
      Tejun Heo 提交于
      Now that cgroup_subtree_control_write() has access to the associated
      kernfs_open_file and thus the kernfs_node, there's no need to cache it
      in cgroup->control_kn on creation.  Remove cgroup->control_kn and use
      @of->kn directly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      b7fc5ad2
    • T
      cgroup: replace cftype->trigger() with cftype->write() · 6770c64e
      Tejun Heo 提交于
      cftype->trigger() is pointless.  It's trivial to ignore the input
      buffer from a regular ->write() operation.  Convert all ->trigger()
      users to ->write() and remove ->trigger().
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      6770c64e
    • T
      cgroup: replace cftype->write_string() with cftype->write() · 451af504
      Tejun Heo 提交于
      Convert all cftype->write_string() users to the new cftype->write()
      which maps directly to kernfs write operation and has full access to
      kernfs and cgroup contexts.  The conversions are mostly mechanical.
      
      * @css and @cft are accessed using of_css() and of_cft() accessors
        respectively instead of being specified as arguments.
      
      * Should return @nbytes on success instead of 0.
      
      * @buf is not trimmed automatically.  Trim if necessary.  Note that
        blkcg and netprio don't need this as the parsers already handle
        whitespaces.
      
      cftype->write_string() has no user left after the conversions and
      removed.
      
      While at it, remove unnecessary local variable @p in
      cgroup_subtree_control_write() and stale comment about
      CGROUP_LOCAL_BUFFER_SIZE in cgroup_freezer.c.
      
      This patch doesn't introduce any visible behavior changes.
      
      v2: netprio was missing from conversion.  Converted.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NAristeu Rozanski <arozansk@redhat.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      451af504
    • T
      cgroup: implement cftype->write() · b4168640
      Tejun Heo 提交于
      During the recent conversion to kernfs, cftype's seq_file operations
      are updated so that they are directly mapped to kernfs operations and
      thus can fully access the associated kernfs and cgroup contexts;
      however, write path hasn't seen similar updates and none of the
      existing write operations has access to, for example, the associated
      kernfs_open_file.
      
      Let's introduce a new operation cftype->write() which maps directly to
      the kernfs write operation and has access to all the arguments and
      contexts.  This will replace ->write_string() and ->trigger() and ease
      manipulation of kernfs active protection from cgroup file operations.
      
      Two accessors - of_cft() and of_css() - are introduced to enable
      accessing the associated cgroup context from cftype->write() which
      only takes kernfs_open_file for the context information.  The
      accessors for seq_file operations - seq_cft() and seq_css() - are
      rewritten to wrap the of_ accessors.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      b4168640
    • T
      cgroup: rename css_tryget*() to css_tryget_online*() · ec903c0c
      Tejun Heo 提交于
      Unlike the more usual refcnting, what css_tryget() provides is the
      distinction between online and offline csses instead of protection
      against upping a refcnt which already reached zero.  cgroup is
      planning to provide actual tryget which fails if the refcnt already
      reached zero.  Let's rename the existing trygets so that they clearly
      indicate that they're onliness.
      
      I thought about keeping the existing names as-are and introducing new
      names for the planned actual tryget; however, given that each
      controller participates in the synchronization of the online state, it
      seems worthwhile to make it explicit that these functions are about
      on/offline state.
      
      Rename css_tryget() to css_tryget_online() and css_tryget_from_dir()
      to css_tryget_online_from_dir().  This is pure rename.
      
      v2: cgroup_freezer grew new usages of css_tryget().  Update
          accordingly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      ec903c0c