1. 13 Aug, 2013 4 commits
    • cgroup: cgroup_css_from_dir() now should be called with RCU read locked · b77d7b60
      Committed by Tejun Heo
      cgroup->subsys[] will become RCU protected and thus all cgroup_css()
      usages should either be under RCU read lock or cgroup_mutex.  This
      patch updates cgroup_css_from_dir() which returns the matching
      cgroup_subsys_state given a directory file and subsys_id so that it
      requires RCU read lock and updates its sole user
      perf_cgroup_connect().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      b77d7b60
    • cgroup: add cgroup_subsys_state->parent · 0ae78e0b
      Committed by Tejun Heo
      With the planned unified hierarchy, css's (cgroup_subsys_state) will
      be RCU protected and allowed to be attached and detached dynamically
      over the course of a cgroup's lifetime.  This means that css's will
      stay accessible after being detached from its cgroup - the matching
      pointer in cgroup->subsys[] cleared - for ref draining and RCU grace
      period.
      
      cgroup core still wants to guarantee that the parent css is never
      destroyed before its children and css_parent() always returns the
      parent regardless of the state of the child css as long as it's
      accessible.
      
      This patch makes css's hold onto their parents and adds css->parent so
      that the parent css is never destroyed before its children and can be
      determined without consulting the cgroups.
      
      cgroup->dummy_css is also updated to point to the parent dummy_css;
      however, it doesn't need to worry about object lifetime as the parent
      cgroup is already pinned by the child.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      0ae78e0b
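
The lifetime rule described in this commit (a child css holds a reference on its parent, so the parent cannot be destroyed first) can be sketched in userspace C. This is an illustrative sketch only, not the kernel code: `struct css`, `css_create()`, `css_put()` and the `live_css` counter are hypothetical stand-ins, and a plain `int` replaces the kernel's reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cgroup_subsys_state; not the kernel struct. */
struct css {
    struct css *parent;  /* NULL for a root css */
    int refcnt;          /* plain int here; an assumption for illustration */
};

static int live_css;     /* number of css objects not yet freed */

/* Create a css. A child takes one reference on its parent. */
static struct css *css_create(struct css *parent)
{
    struct css *css = calloc(1, sizeof(*css));

    if (!css)
        return NULL;
    css->refcnt = 1;              /* base reference owned by the creator */
    css->parent = parent;
    if (parent)
        parent->refcnt++;         /* the child pins its parent */
    live_css++;
    return css;
}

/* Drop one reference. Freeing a css releases the ref it held on its parent. */
static void css_put(struct css *css)
{
    while (css) {
        struct css *parent;

        assert(css->refcnt > 0);
        if (--css->refcnt)
            break;                /* still referenced; nothing to free */
        parent = css->parent;
        free(css);
        live_css--;
        css = parent;             /* now drop the child's ref on the parent */
    }
}
```

With this arrangement, dropping the creator's reference on a parent while a child still exists leaves the parent allocated, and `child->parent` stays valid until the child itself is released.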
    • cgroup: rename cgroup_subsys_state->dput_work and its callback function · 35ef10da
      Committed by Tejun Heo
      css (cgroup_subsys_state) will become RCU protected and there will be
      two stages which require punting to a work item during release.  To
      prepare for using the work item multiple times, rename
      css->dput_work to css->destroy_work and css_dput_fn() to
      css_free_work_fn() and move work item initialization from css init to
      right before the actual usage.
      
      This reorganization doesn't introduce any behavior change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      35ef10da
    • cgroup: always use cgroup_css() · 40e93b39
      Committed by Tejun Heo
      cgroup_css() is the accessor for cgroup->subsys[] but is not used
      consistently.  cgroup->subsys[] will become RCU protected and
      cgroup_css() will grow synchronization sanity checks.  In preparation,
      make all cgroup->subsys[] dereferences use cgroup_css() consistently.
      
      This patch doesn't introduce any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      40e93b39
  2. 09 Aug, 2013 19 commits
    • cgroup: make css_for_each_descendant() and friends include the origin css in the iteration · bd8815a6
      Committed by Tejun Heo
      Previously, all css descendant iterators didn't include the origin
      (root of subtree) css in the iteration.  The reasons were maintaining
      consistency with css_for_each_child() and that at the time of
      introduction more use cases needed skipping the origin anyway;
      however, given that css_is_descendant() considers self to be a
      descendant, omitting the origin css has become more confusing and
      looking at the accumulated use cases rather clearly indicates that
      including origin would result in simpler code overall.
      
      While this is a change which can easily lead to subtle bugs, cgroup
      API including the iterators has recently gone through major
      restructuring and no out-of-tree changes will be applicable without
      adjustments making this a relatively acceptable opportunity for this
      type of change.
      
      The conversions are mostly straight-forward.  If the iteration block
      had explicit origin handling before or after, it's moved inside the
      iteration.  If not, if (pos == origin) continue; is added.  Some
      conversions add extra reference get/put around origin handling by
      consolidating origin handling and the rest.  While the extra ref
      operations aren't strictly necessary, this shouldn't cause any
      noticeable difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      bd8815a6
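
The conversion pattern described above (the walk now yields the origin first, and callers that only want proper descendants add `if (pos == origin) continue;`) can be illustrated with a small userspace pre-order walk. This is a sketch under assumptions: `struct node`, `next_descendant_pre()` and `sum_proper_descendants()` are hypothetical names, not the kernel iterators.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node standing in for a css in a hierarchy. */
struct node {
    struct node *parent;
    struct node *first_child;
    struct node *next_sibling;
    int id;
};

/*
 * Return the node after @pos in a pre-order walk of @origin's subtree.
 * Passing pos == NULL starts the walk and yields @origin itself, so the
 * origin is part of the iteration.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *origin)
{
    assert(origin != NULL);
    if (!pos)
        return origin;
    if (pos->first_child)
        return pos->first_child;
    /* No children: climb until a sibling exists or we are back at origin. */
    while (pos != origin) {
        if (pos->next_sibling)
            return pos->next_sibling;
        pos = pos->parent;
    }
    return NULL;
}

/* A caller that wants proper descendants only must now skip the origin. */
static int sum_proper_descendants(struct node *origin)
{
    int sum = 0;
    struct node *pos;

    for (pos = next_descendant_pre(NULL, origin); pos;
         pos = next_descendant_pre(pos, origin)) {
        if (pos == origin)
            continue;
        sum += pos->id;
    }
    return sum;
}
```

Callers that previously handled the origin separately before or after the loop can instead fold that handling into the loop body.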
    • cgroup: unexport cgroup_css() · 95109b62
      Committed by Tejun Heo
      cgroup_css() no longer has any user left outside cgroup.c proper and
      we don't want subsystems to grow new usages of the function.  cgroup
      core should always provide the css to use to the subsystems, which
      will make dynamic creation and destruction of css's across the
      lifetime of a cgroup much more manageable than exposing the cgroup
      directly to subsystems and letting them dereference css's from it.
      
      Make cgroup_css() a static function in cgroup.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      95109b62
    • cgroup: make cgroup_taskset deal with cgroup_subsys_state instead of cgroup · d99c8727
      Committed by Tejun Heo
      cgroup is in the process of converting to css (cgroup_subsys_state)
      from cgroup as the principal subsystem interface handle.  This is
      mostly to prepare for the unified hierarchy support where css's will
      be created and destroyed dynamically but also helps cleaning up
      subsystem implementations as css is usually what they are interested
      in anyway.
      
      cgroup_taskset which is used by the subsystem attach methods is the
      last cgroup subsystem API which isn't using css as the handle.  Update
      cgroup_taskset_cur_cgroup() to cgroup_taskset_cur_css() and
      cgroup_taskset_for_each() to take @skip_css instead of @skip_cgrp.
      
      The conversions are pretty mechanical.  One exception is
      cpuset::cgroup_cs(), which lost its last user and got removed.
      
      This patch shouldn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      d99c8727
    • cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup · 81eeaf04
      Committed by Tejun Heo
      cgroup is in the process of converting to css (cgroup_subsys_state)
      from cgroup as the principal subsystem interface handle.  This is
      mostly to prepare for the unified hierarchy support where css's will
      be created and destroyed dynamically but also helps cleaning up
      subsystem implementations as css is usually what they are interested
      in anyway.
      
      cftype->[un]register_event() is among the couple of remaining interfaces
      which still use struct cgroup.  Convert it to cgroup_subsys_state.
      The conversion is mostly mechanical and removes the last users of
      mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.
      
      v2: indentation update as suggested by Li Zefan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      81eeaf04
    • cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup · 72ec7029
      Committed by Tejun Heo
      cgroup is in the process of converting to css (cgroup_subsys_state)
      from cgroup as the principal subsystem interface handle.  This is
      mostly to prepare for the unified hierarchy support where css's will
      be created and destroyed dynamically but also helps cleaning up
      subsystem implementations as css is usually what they are interested
      in anyway.
      
      This patch converts task iterators to deal with css instead of cgroup.
      Note that under unified hierarchy, different sets of tasks will be
      considered belonging to a given cgroup depending on the subsystem in
      question and making the iterators deal with css instead of cgroup
      provides them with enough information about the iteration.
      
      While at it, fix several function comment formats in cpuset.c.
      
      This patch doesn't introduce any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      72ec7029
    • cgroup: remove struct cgroup_scanner · e535837b
      Committed by Tejun Heo
      cgroup_scan_tasks() takes a pointer to struct cgroup_scanner as its
      sole argument and the only function of that struct is packing the
      arguments of the function call, which consist of five fields.
      It's not too unusual to pack parameters into a struct when the number
      of arguments gets excessive or the whole set needs to be passed around
      a lot, but neither holds here making it just weird.
      
      Drop struct cgroup_scanner and pass the params directly to
      cgroup_scan_tasks().  Note that struct cpuset_change_nodemask_arg was
      added to cpuset.c to pass both ->cs and ->newmems pointer to
      cpuset_change_nodemask() using single data pointer.
      
      This doesn't make any functional differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      e535837b
    • cgroup: make cgroup_task_iter remember the cgroup being iterated · c59cd3d8
      Committed by Tejun Heo
      Currently all cgroup_task_iter functions require @cgrp to be passed
      in, which is superfluous and increases the chance of usage error.  Make
      cgroup_task_iter remember the cgroup being iterated and drop @cgrp
      argument from next and end functions.
      
      This patch doesn't introduce any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      c59cd3d8
    • cgroup: rename cgroup_iter to cgroup_task_iter · 0942eeee
      Committed by Tejun Heo
      cgroup now has multiple iterators and it's quite confusing to have
      something which walks over tasks of a single cgroup named cgroup_iter.
      Let's rename it to cgroup_task_iter.
      
      While at it, reformat / update comments and replace the overview
      comment above the interface function decls with proper function
      comments.  Such overview can be useful but function comments should be
      more than enough here.
      
      This is pure rename and doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      0942eeee
    • cgroup: relocate cgroup_advance_iter() · d515876e
      Committed by Tejun Heo
      For some reason, cgroup_advance_iter() is standing lonely, far away
      from its iter comrades.  Relocate it.
      
      This is cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      d515876e
    • cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup · 492eb21b
      Committed by Tejun Heo
      cgroup is currently in the process of transitioning to using css
      (cgroup_subsys_state) as the primary handle instead of cgroup in
      subsystem API.  For hierarchy iterators, this is beneficial because
      
      * In most cases, css is the only thing subsystems care about anyway.
      
      * On the planned unified hierarchy, iterations for different
        subsystems will need to skip over different subtrees of the
        hierarchy depending on which subsystems are enabled on each cgroup.
        Passing around css makes it unnecessary to explicitly specify the
        subsystem in question as css is the intersection between cgroup and
        subsystem.
      
      * For the planned unified hierarchy, css's would need to be created
        and destroyed dynamically independent from cgroup hierarchy.  Having
        cgroup core manage css iteration makes enforcing deref rules a lot
        easier.
      
      Most subsystem conversions are straight-forward.  Noteworthy changes
      are
      
      * blkio: cgroup_to_blkcg() is no longer used.  Removed.
      
      * freezer: cgroup_freezer() is no longer used.  Removed.
      
      * devices: cgroup_to_devcgroup() is no longer used.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      492eb21b
    • cgroup: always use cgroup_next_child() to walk the children list · f48e3924
      Committed by Tejun Heo
      There are several places where the children list is accessed directly.
      This patch converts those places to use cgroup_next_child().  This
      will help updating the hierarchy iterators to use @css instead of
      @cgrp.
      
      While cgroup_next_child() can be heavy in pathological cases - e.g. a
      lot of dead children, this shouldn't cause any noticeable behavior
      differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      f48e3924
    • cgroup: convert cgroup_next_sibling() to cgroup_next_child() · 3b287a50
      Committed by Tejun Heo
      cgroup is transitioning to using css (cgroup_subsys_state) as the main
      subsys interface handle instead of cgroup and the iterators will be
      updated to use css too.  The iterators need to walk the cgroup
      hierarchy and return the css's matching the origin css, which is a bit
      cumbersome to open code.
      
      This patch converts cgroup_next_sibling() to cgroup_next_child() so
      that it can handle all steps of direct child iteration.  This will be
      used to update iterators to take @css instead of @cgrp.  In addition
      to the new iteration init handling, cgroup_next_child() is
      restructured so that the different branches share the end of iteration
      condition check.
      
      This patch doesn't change any behavior.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      3b287a50
    • cgroup: pass around cgroup_subsys_state instead of cgroup in file methods · 182446d0
      Committed by Tejun Heo
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup.
      Please see the previous commit which converts the subsystem methods
      for rationale.
      
      This patch converts all cftype file operations to take @css instead of
      @cgroup.  cftypes for the cgroup core files don't have their subsystem
      pointer set.  These will automatically use the dummy_css added by the
      previous patch and can be converted the same way.
      
      Most subsystem conversions are straightforward but there are some
      interesting ones.
      
      * freezer: update_if_frozen() is also converted to take @css instead
        of @cgroup for consistency.  This will make the code look simpler
        too once iterators are converted to use css.
      
      * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
        vmpressure while mem_cgroup_from_cont() can be made static.
        Updated accordingly.
      
      * cpu: cgroup_tg() doesn't have any user left.  Removed.
      
      * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
      
      * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
        Removed.
      
      * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      182446d0
    • cgroup: add cgroup->dummy_css · 67f4c36f
      Committed by Tejun Heo
      cgroup subsystem API is being converted to use css
      (cgroup_subsys_state) as the main handle, which makes things a bit
      awkward for subsystem agnostic core features - the "cgroup.*"
      interface files and various iterations - as they don't have a css to
      use.
      
      This patch adds cgroup->dummy_css which has NULL ->ss and whose only
      role is pointing back to the cgroup.  This will be used to support
      subsystem agnostic features on the coming css based API.
      
      css_parent() is updated to handle dummy_css's.  Note that css will
      soon grow its own ->parent field and css_parent() will be made
      trivial.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      67f4c36f
    • cgroup: pin cgroup_subsys_state when opening a cgroupfs file · f7d58818
      Committed by Tejun Heo
      Previously, each file read/write operation relied on the inode
      reference count pinning the cgroup and simply checked whether the
      cgroup was marked dead before proceeding to invoke the per-subsystem
      callback.  This was rather silly as it didn't have any synchronization
      or css pinning around the check and the cgroup may be removed and all
      css refs drained between the DEAD check and actual method invocation.
      
      This patch pins the css between open() and release() so that it is
      guaranteed to be alive for all file operations and removes the silly
      DEAD checks from cgroup_file_read/write().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      f7d58818
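
The idea of pinning at open() rather than checking a DEAD flag on every read or write can be sketched as follows. This is a userspace illustration under assumptions: `struct css`, `struct open_file`, `file_open()`, `file_read()` and `file_release()` are hypothetical, the refcount is a plain `int`, and returning `-ENODEV` on a failed open is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct css {
    int refcnt;
    bool dead;   /* set once removal has started */
};

struct open_file {
    struct css *css;   /* pinned from open until release */
};

/* Take a reference unless the css is already being torn down. */
static bool css_tryget(struct css *css)
{
    if (css->dead)
        return false;
    css->refcnt++;
    return true;
}

static void css_put(struct css *css)
{
    assert(css->refcnt > 0);
    css->refcnt--;
}

/* open(): pin the css once. The error code is an assumption. */
static int file_open(struct open_file *f, struct css *css)
{
    if (!css_tryget(css))
        return -ENODEV;
    f->css = css;
    return 0;
}

/* read(): no DEAD check needed, the open-time reference keeps css alive. */
static int file_read(const struct open_file *f)
{
    return f->css->refcnt;
}

/* release(): drop the reference taken at open(). */
static void file_release(struct open_file *f)
{
    css_put(f->css);
    f->css = NULL;
}
```

Because the reference is held for the whole open-to-release window, a removal that starts after open() cannot drain the last reference while a file operation is in flight.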
    • cgroup: add subsys backlink pointer to cftype · 2bb566cb
      Committed by Tejun Heo
      cgroup is transitioning to using css (cgroup_subsys_state) instead of
      cgroup as the primary subsystem handle.  The cgroupfs file interface
      will be converted to use css's which requires finding out the
      subsystem from cftype so that the matching css can be determined from
      the cgroup.
      
      This patch adds cftype->ss which points to the subsystem the file
      belongs to.  The field is initialized while a cftype is being
      registered.  This makes it unnecessary to explicitly specify the
      subsystem for other cftype handling functions.  @ss argument dropped
      from various cftype handling functions.
      
      This patch shouldn't introduce any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      2bb566cb
    • cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods · eb95419b
      Committed by Tejun Heo
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup *
      in subsystem implementations for the following reasons.
      
      * With unified hierarchy, subsystems will be dynamically bound and
        unbound from cgroups and thus css's (cgroup_subsys_state) may be
        created and destroyed dynamically over the lifetime of a cgroup,
        which is different from the current state where all css's are
        allocated and destroyed together with the associated cgroup.  This
        in turn means that cgroup_css() should be synchronized and may
        return NULL, making it more cumbersome to use.
      
      * Differing levels of per-subsystem granularity in the unified
        hierarchy means that the task and descendant iterators should behave
        differently depending on the specific subsystem the iteration is
        being performed for.
      
      * In the majority of cases, subsystems only care about their part in
        the cgroup hierarchy - i.e. the hierarchy of css's.  Subsystem methods
        often obtain the matching css pointer from the cgroup and don't
        bother with the cgroup pointer itself.  Passing around css fits
        much better.
      
      This patch converts all cgroup_subsys methods to take @css instead of
      @cgroup.  The conversions are mostly straight-forward.  A few
      noteworthy changes are
      
      * ->css_alloc() now takes css of the parent cgroup rather than the
        pointer to the new cgroup as the css for the new cgroup doesn't
        exist yet.  Knowing the parent css is enough for all the existing
        subsystems.
      
      * In kernel/cgroup.c::offline_css(), unnecessary open coded css
        dereference is replaced with local variable access.
      
      This patch shouldn't cause any behavior differences.
      
      v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
          with local variable @css as suggested by Li Zefan.
      
          Rebased on top of new for-3.12 which includes for-3.11-fixes so
          that ->css_free() invocation added by da0a12ca ("cgroup: fix a
          leak when percpu_ref_init() fails") is converted too.  Suggested
          by Li Zefan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eb95419b
    • cgroup: add subsystem pointer to cgroup_subsys_state · 72c97e54
      Committed by Tejun Heo
      Currently, given a cgroup_subsys_state, there's no way to find out
      which subsystem the css is for, which we'll need to convert the cgroup
      controller API to primarily use @css instead of @cgroup.  This patch
      adds cgroup_subsys_state->ss which points to the subsystem the @css
      belongs to.
      
      While at it, remove the comment about accessing @css->cgroup to
      determine the hierarchy.  cgroup core will provide API to traverse
      hierarchy of css'es and we don't want subsystems to directly walk
      cgroup hierarchies anymore.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      72c97e54
    • cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ · 8af01f56
      Committed by Tejun Heo
      The names of the two struct cgroup_subsys_state accessors -
      cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
      The former clashes with the type name and the latter doesn't even
      indicate it's somehow related to cgroup.
      
      We're about to revamp a large portion of cgroup API, so, let's rename
      them so that they're less awkward.  Most per-controller usages of the
      accessors are localized in accessor wrappers and given the amount of
      scheduled changes, this isn't gonna add any noticeable headache.
      
      Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
      to task_css().  This patch is pure rename.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      8af01f56
  3. 01 Aug, 2013 2 commits
  4. 31 Jul, 2013 5 commits
  5. 16 Jul, 2013 2 commits
    • cgroup: remove gratuitous BUG_ON()s from rebind_subsystems() · a698b448
      Committed by Tejun Heo
      rebind_subsystems() performs sanity checks even on subsystems which
      aren't specified to be added or removed and the checks aren't all that
      useful given that these are in a very cold path while the violations
      they check would trip up in much hotter paths.
      
      Let's remove these from rebind_subsystems().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      a698b448
    • cgroup: move module ref handling into rebind_subsystems() · 1d5be6b2
      Committed by Tejun Heo
      Module ref handling in cgroup is rather weird.
      parse_cgroupfs_options() grabs all the modules for the specified
      subsystems.  A module ref is kept if the specified subsystem is newly
      bound to the hierarchy.  If not, or the operation fails, the refs are
      dropped.  This scatters module ref handling across multiple functions
      making it difficult to track.  It also makes the function nasty to use
      for dynamic subsystem binding which is necessary for the planned
      unified hierarchy.
      
      There's nothing which requires the subsystem modules to be pinned
      between parse_cgroupfs_options() and rebind_subsystems() in both mount
      and remount paths.  parse_cgroupfs_options() can just parse and
      rebind_subsystems() can handle pinning the subsystems that it wants to
      bind, which is a natural part of its task - binding - anyway.
      
      Move module ref handling into rebind_subsystems() which makes the code
      a lot simpler - modules are gotten iff it's gonna be bound and put iff
      unbound or binding fails.
      
      v2: Li pointed out that if a controller module is unloaded between
          parsing and binding, rebind_subsystems() won't notice the missing
          controller as it only iterates through existing controllers.  Fix
          it by updating rebind_subsystems() to compare @added_mask to
          @pinned and fail with -ENOENT if they don't match.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      1d5be6b2
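
The v2 note above (pin the modules for the subsystems to be bound, then compare the added mask against what was actually pinned and fail with -ENOENT on a mismatch) can be sketched like this. It is a userspace illustration, not the kernel function: `struct subsys`, `subsys_table`, `NR_SUBSYS` and `rebind_sketch()` are hypothetical, and a missing module is modelled by a `present` flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SUBSYS 4   /* hypothetical number of subsystem slots */

/* Hypothetical stand-in for a controller and its module. */
struct subsys {
    bool present;      /* false models a module unloaded after parsing */
    int module_refs;   /* stands in for the module reference count */
    bool bound;
};

static struct subsys subsys_table[NR_SUBSYS];

/*
 * Bind every subsystem in @added_mask. Module refs are taken only here,
 * and only kept if the whole bind succeeds.
 */
static int rebind_sketch(unsigned long added_mask)
{
    unsigned long pinned = 0;
    int i;

    /* Pin only what exists; an unloaded controller is silently skipped. */
    for (i = 0; i < NR_SUBSYS; i++) {
        struct subsys *ss = &subsys_table[i];

        if (!(added_mask & (1UL << i)) || !ss->present)
            continue;
        ss->module_refs++;
        pinned |= 1UL << i;
    }

    /* A requested controller went missing: undo the pins and fail. */
    if (pinned != added_mask) {
        for (i = 0; i < NR_SUBSYS; i++)
            if (pinned & (1UL << i))
                subsys_table[i].module_refs--;
        return -ENOENT;
    }

    for (i = 0; i < NR_SUBSYS; i++)
        if (added_mask & (1UL << i))
            subsys_table[i].bound = true;
    return 0;
}
```

The mask comparison is what catches the case the loop alone would miss, since the loop only visits controllers that still exist.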
  6. 13 Jul, 2013 8 commits
    • cgroup: replace task_cgroup_path_from_hierarchy() with task_cgroup_path() · 913ffdb5
      Committed by Tejun Heo
      task_cgroup_path_from_hierarchy() was added for the planned new users
      and none of the currently planned users wants to know about multiple
      hierarchies.  This patch drops the multiple hierarchy part and makes
      it always return the path in the first non-dummy hierarchy.
      
      As unified hierarchy will always have id 1, this is guaranteed to
      return the path for the unified hierarchy if mounted; otherwise, it
      will return the path from the hierarchy which happens to occupy the
      lowest hierarchy id, which will usually be the first hierarchy mounted
      after boot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Lennart Poettering <lennart@poettering.net>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jan Kaluža <jkaluza@redhat.com>
      913ffdb5
    • cgroup: move number_of_cgroups test out of rebind_subsystems() into cgroup_remount() · f172e67c
      Committed by Tejun Heo
      rebind_subsystems() currently fails if the hierarchy has any !root
      cgroups; however, on the planned unified hierarchy,
      rebind_subsystems() will be used while populated.  Move the test to
      cgroup_remount(), which is the only place the test is necessary
      anyway.
      
      As it's impossible for the other two callers of rebind_subsystems() to
      have a populated hierarchy, this doesn't make any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      f172e67c
    • cgroup: make rebind_subsystems() handle file additions and removals with proper error handling · 3126121f
      Committed by Tejun Heo
      Currently, creating and removing cgroup files in the root directory
      are handled separately from the actual subsystem binding and unbinding
      which happens in rebind_subsystems().  Also, rebind_subsystems() users
      aren't handling file creation errors properly.  Let's integrate
      top_cgroup file handling into rebind_subsystems() so that it's simpler
      to use and everyone handles file creation errors correctly.
      
      * On a successful return, rebind_subsystems() is guaranteed to have
        created all files of the new subsystems and deleted the ones
        belonging to the removed subsystems.  After a failure, no file is
        created or removed.
      
      * cgroup_remount() no longer needs to make explicit populate/clear
        calls as it's all handled by rebind_subsystems(), and it gets proper
        error handling automatically.
      
      * cgroup_mount() has been updated such that the root dentry and cgroup
        are linked before rebind_subsystems().  Also, the init_cred dancing
        and base file handling are moved right above rebind_subsystems()
        call and proper error handling for the base files is added.  While
        at it, add a comment explaining what's going on with the cred thing.
      
      * cgroup_kill_sb() calls rebind_subsystems() to unbind all subsystems
        which now implies removing all subsystem files which requires the
        directory's i_mutex.  Grab it.  This means that files on the root
        cgroup are removed earlier - they used to be deleted from generic
        super_block cleanup from vfs.  This doesn't lead to any functional
        difference and it's cleaner to do the clean up explicitly for all
        files.
      
      Combined with the previous changes, this makes all cgroup file
      creation errors handled correctly.
      
      v2: Added comment on init_cred.
      
      v3: Li spotted that cgroup_mount() wasn't freeing tmp_links after base
          file addition failure.  Fix it by adding free_tmp_links error
          handling label.
      
      v4: v3 introduced build bugs which got noticed by Fengguang's awesome
          kbuild test robot.  Fixed, and shame on me.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
    • T
      cgroup: use for_each_subsys() instead of for_each_root_subsys() in cgroup_populate/clear_dir() · b420ba7d
      Tejun Heo committed
      rebind_subsystems() will be updated to handle file creations and
      removals with proper error handling and to do that will need to
      perform file operations before actually adding the subsystem to the
      hierarchy.
      
      To enable such usage, update cgroup_populate/clear_dir() to use
      for_each_subsys() instead of for_each_root_subsys() so that they
      operate on all subsystems specified by @subsys_mask whether that
      subsystem is currently bound to the hierarchy or not.
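
      The difference between the two iteration styles can be sketched as
      follows.  This is a simplified model under stated assumptions, not
      the kernel macros: struct demo_subsys, demo_table and
      demo_count_visited() are hypothetical names, and the "bound" flag
      stands in for a subsystem being attached to the hierarchy.

```c
#include <stddef.h>

#define DEMO_SUBSYS_COUNT 4

/* Hypothetical subsystem record; "bound" marks attachment to the hierarchy. */
struct demo_subsys {
	const char *name;
	int bound;
};

/*
 * Count the subsystems that a populate/clear pass would touch.
 * With only_bound set, subsystems that are not bound are skipped even
 * when their bit is set in mask; otherwise every masked subsystem is
 * visited, bound or not.
 */
static int demo_count_visited(const struct demo_subsys *ss,
			      unsigned long mask, int only_bound)
{
	int visited = 0;

	for (size_t i = 0; i < DEMO_SUBSYS_COUNT; i++) {
		if (!(mask & (1UL << i)))
			continue;
		if (only_bound && !ss[i].bound)
			continue;
		visited++;
	}
	return visited;
}

/* Illustrative table: entries 0 and 2 are bound, 1 and 3 are not. */
static const struct demo_subsys demo_table[DEMO_SUBSYS_COUNT] = {
	{ "a", 1 }, { "b", 0 }, { "c", 1 }, { "d", 0 },
};
```

      With a mask selecting entries 1 and 2, the bound-only walk skips the
      unbound entry while the mask-only walk visits both, which is what
      allows file operations on a subsystem before it is bound.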
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • T
      cgroup: update error handling in cgroup_populate_dir() · bee55099
      Tejun Heo committed
      cgroup_populate_dir() didn't use to check whether the actual file
      creations were successful and could return success with only a subset
      of the requested files created, which is nasty.
      
      This patch updates cgroup_populate_dir() so that it either succeeds
      with all files or fails with no file.
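
      The all-or-nothing behavior can be illustrated with a minimal
      sketch.  It assumes a toy directory model and is not the kernel
      implementation: struct demo_dir, demo_add_file(), demo_rm_file(),
      demo_populate() and demo_count() are hypothetical names introduced
      only for this example.

```c
#include <stdbool.h>

#define DEMO_MAX_FILES 8

/* Hypothetical directory model: created[i] is true when file i exists. */
struct demo_dir {
	bool created[DEMO_MAX_FILES];
	int fail_at;	/* index whose creation fails, or -1 for none */
};

/* Create file i; fails with -1 when i is the configured failure index. */
static int demo_add_file(struct demo_dir *dir, int i)
{
	if (i == dir->fail_at)
		return -1;
	dir->created[i] = true;
	return 0;
}

static void demo_rm_file(struct demo_dir *dir, int i)
{
	dir->created[i] = false;
}

/*
 * Create files [0, nr).  On the first failure, remove every file this
 * call already created so the directory is left as it was found.
 */
static int demo_populate(struct demo_dir *dir, int nr)
{
	int i, ret = 0;

	if (nr > DEMO_MAX_FILES)
		return -1;

	for (i = 0; i < nr; i++) {
		ret = demo_add_file(dir, i);
		if (ret)
			goto rollback;
	}
	return 0;

rollback:
	while (--i >= 0)
		demo_rm_file(dir, i);
	return ret;
}

/* Number of files currently present in the directory. */
static int demo_count(const struct demo_dir *dir)
{
	int n = 0;

	for (int i = 0; i < DEMO_MAX_FILES; i++)
		n += dir->created[i];
	return n;
}
```

      A caller therefore sees either a fully populated directory and a
      zero return, or an error and no files from the failed call.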
      
      v2: The original patch also converted for_each_root_subsys() usages to
          for_each_subsys() without explaining why.  That part has been
          moved to a separate patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • T
      cgroup: separate out cgroup_base_files[] handling out of cgroup_populate/clear_dir() · 628f7cd4
      Tejun Heo committed
      cgroup_populate/clear_dir() currently take @base_files and add and
      remove, respectively, cgroup_base_files[] to the directory.  File
      additions and removals are being reorganized for proper error handling
      and more dynamic handling for the unified hierarchy, and mixing base
      and subsys file handling into the same functions gets a bit confusing.
      
      This patch moves base file handling out of cgroup_populate/clear_dir()
      into their users - cgroup_mount(), cgroup_create() and
      cgroup_destroy_locked().
      
      Note that this changes the behavior of base file removal.  If
      @base_files is %true, cgroup_clear_dir() used to delete files
      regardless of cftype until there are no files left.  Now, only files
      with matching cfts are removed.  As files can only be created by the
      base or registered cftypes, this shouldn't result in any behavior
      difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • T
      cgroup: fix cgroup_add_cftypes() error handling · 9ccece80
      Tejun Heo committed
      cgroup_add_cftypes() uses cgroup_cfts_commit() to actually create the
      files; however, both functions ignore actual file creation errors and
      just assume success.  This can lead to, for example, a blkio
      hierarchy in which some cgroups have only a subset of their interface
      files populated after cfq-iosched is loaded under heavy memory
      pressure, which is nasty.
      
      This patch updates cgroup_cfts_commit() and cgroup_add_cftypes() to
      guarantee that all files are created on success and no file is created
      on failure.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • T
      cgroup: fix error path of cgroup_addrm_files() · b1f28d31
      Tejun Heo committed
      cgroup_addrm_files() mishandles the error return value from
      cgroup_add_file() and returns an error iff the last file fails to
      create.  As we're in the process of cleaning up file add/rm error
      handling and will reliably propagate file creation failures, there's
      no point in continuing to add files after a failure.
      
      Replace the broken error collection logic with immediate error return.
      While at it, add lockdep assertions and a function comment.
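
      One way such a "last file only" result can arise, and the shape of
      the fix, is sketched below.  This is an illustration under
      assumptions rather than the actual kernel code: add_file_fn,
      add_files_last_error_only(), add_files_fail_fast(), fail_on_second()
      and DEMO_ERR are all hypothetical names.

```c
/* Hypothetical stand-in for a per-file creation step; not the kernel API. */
typedef int (*add_file_fn)(int index);

enum { DEMO_ERR = -1 };	/* arbitrary negative error code for the demo */

/*
 * Broken pattern: the loop keeps going and every iteration overwrites
 * err, so only the outcome of the last file reaches the caller.
 */
static int add_files_last_error_only(add_file_fn add, int nr_files)
{
	int err = 0;

	for (int i = 0; i < nr_files; i++)
		err = add(i);
	return err;
}

/* Fixed pattern: stop at the first failure and return it immediately. */
static int add_files_fail_fast(add_file_fn add, int nr_files)
{
	for (int i = 0; i < nr_files; i++) {
		int ret = add(i);

		if (ret)
			return ret;
	}
	return 0;
}

/* Demo callback: creation fails only for the file at index 1. */
static int fail_on_second(int index)
{
	return index == 1 ? DEMO_ERR : 0;
}
```

      With three files and a failure on the middle one, the first variant
      reports success while the second reports the failure and never
      attempts the third file.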
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>