1. 24 5月, 2013 2 次提交
    • T
      cgroup: make cgroup_is_removed() static · bdc7119f
      Tejun Heo 提交于
      cgroup_is_removed() no longer has external users and it shouldn't grow
      any - controllers should deal with cgroup_subsys_state on/offline
      state instead of cgroup removal state.  Make it static.
      
      While at it, make it return bool.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      bdc7119f
    • T
      cgroup: fix a subtle bug in descendant pre-order walk · 7805d000
      Tejun Heo 提交于
      When cgroup_next_descendant_pre() initiates a walk, it checks whether
      the subtree root doesn't have any children and if not returns NULL.
      Later code assumes that the subtree isn't empty.  This is broken
      because the subtree may become empty inbetween, which can lead to the
      traversal escaping the subtree by walking to the sibling of the
      subtree root.
      
      There's no reason to have the early exit path.  Remove it along with
      the later assumption that the subtree isn't empty.  This simplifies
      the code a bit and fixes the subtle bug.
      
      While at it, fix the comment of cgroup_for_each_descendant_pre() which
      was incorrectly referring to ->css_offline() instead of
      ->css_online().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: stable@vger.kernel.org
      7805d000
  2. 15 5月, 2013 2 次提交
  3. 08 5月, 2013 1 次提交
  4. 02 5月, 2013 1 次提交
  5. 30 4月, 2013 1 次提交
  6. 19 4月, 2013 1 次提交
    • L
      cgroup: fix broken file xattrs · 712317ad
      Li Zefan 提交于
      We should store file xattrs in struct cfent instead of struct cftype,
      because cftype is a type while cfent is object instance of cftype.
      
      For example each cgroup has a tasks file, and each tasks file is
      associated with a uniq cfent, but all those files share the same
      struct cftype.
      
      Alexey Kodanev reported a crash, which can be reproduced:
      
        # mount -t cgroup -o xattr /sys/fs/cgroup
        # mkdir /sys/fs/cgroup/test
        # setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks
        # rmdir /sys/fs/cgroup/test
        # umount /sys/fs/cgroup
        oops!
      
      In this case, simple_xattrs_free() will free the same struct simple_xattrs
      twice.
      
      tj: Dropped unused local variable @cft from cgroup_diput().
      
      Cc: <stable@vger.kernel.org> # 3.8.x
      Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      712317ad
  7. 16 4月, 2013 1 次提交
  8. 15 4月, 2013 3 次提交
    • L
      cgroup: remove cgrp->top_cgroup · 05fb22ec
      Li Zefan 提交于
      It's not used, and it can be retrieved via cgrp->root->top_cgroup.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      05fb22ec
    • T
      cgroup: introduce sane_behavior mount option · 873fe09e
      Tejun Heo 提交于
      It's a sad fact that at this point various cgroup controllers are
      carrying so many idiosyncrasies and pure insanities that it simply
      isn't possible to reach any sort of sane consistent behavior while
      maintaining staying fully compatible with what already has been
      exposed to userland.
      
      As we can't break exposed userland interface, transitioning to sane
      behaviors can only be done in steps while maintaining backwards
      compatibility.  This patch introduces a new mount option -
      __DEVEL__sane_behavior - which disables crazy features and enforces
      consistent behaviors in cgroup core proper and various controllers.
      As exactly which behaviors it changes are still being determined, the
      mount option, at this point, is useful only for development of the new
      behaviors.  As such, the mount option is prefixed with __DEVEL__ and
      generates a warning message when used.
      
      Eventually, once we get to the point where all controller's behaviors
      are consistent enough to implement unified hierarchy, the __DEVEL__
      prefix will be dropped, and more importantly, unified-hierarchy will
      enforce sane_behavior by default.  Maybe we'll able to completely drop
      the crazy stuff after a while, maybe not, but we at least have a
      strategy to move on to saner behaviors.
      
      This patch introduces the mount option and changes the following
      behaviors in cgroup core.
      
      * Mount options "noprefix" and "clone_children" are disallowed.  Also,
        cgroupfs file cgroup.clone_children is not created.
      
      * When mounting an existing superblock, mount options should match.
        This is currently pretty crazy.  If one mounts a cgroup, creates a
        subdirectory, unmounts it and then mount it again with different
        option, it looks like the new options are applied but they aren't.
      
      * Remount is disallowed.
      
      The behaviors changes are documented in the comment above
      CGRP_ROOT_SANE_BEHAVIOR enum and will be expanded as different
      controllers are converted and planned improvements progress.
      
      v2: Dropped unnecessary explicit file permission setting sane_behavior
          cftype entry as suggested by Li Zefan.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      873fe09e
    • T
      move cgroupfs_root to include/linux/cgroup.h · 25a7e684
      Tejun Heo 提交于
      While controllers shouldn't be accessing cgroupfs_root directly, it
      being hidden inside kern/cgroup.c makes somethings pretty silly.  This
      makes routing hierarchy-wide settings which need to be visible to
      controllers cumbersome.
      
      We're gonna add another hierarchy-wide setting which needs to be
      accessed from controllers.  Move cgroupfs_root and its flags to the
      header file so that we can access root settings with inline helpers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      25a7e684
  9. 13 4月, 2013 1 次提交
  10. 11 4月, 2013 2 次提交
    • L
      cgroup: implement cgroup_is_descendant() · 78574cf9
      Li Zefan 提交于
      A couple controllers want to determine whether two cgroups are in
      ancestor/descendant relationship.  As it's more likely that the
      descendant is the primary subject of interest and there are other
      operations focusing on the descendants, let's ask is_descendent rather
      than is_ancestor.
      
      Implementation is trivial as the previous patch guarantees that all
      ancestors of a cgroup stay accessible as long as the cgroup is
      accessible.
      
      tj: Removed depth optimization, renamed from cgroup_is_ancestor(),
          rewrote descriptions.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      78574cf9
    • R
      cgroup: remove bind() method from cgroup_subsys. · 84cfb6ab
      Rami Rosen 提交于
      The bind() method of cgroup_subsys is not used in any of the
      controllers (cpuset, freezer, blkio, net_cls, memcg, net_prio,
      devices, perf, hugetlb, cpu and cpuacct)
      
      tj: Removed the entry on ->bind() from
          Documentation/cgroups/cgroups.txt.  Also updated a couple
          paragraphs which were suggesting that dynamic re-binding may be
          implemented.  It's not gonna.
      Signed-off-by: NRami Rosen <ramirose@gmail.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      84cfb6ab
  11. 10 4月, 2013 1 次提交
  12. 08 4月, 2013 3 次提交
    • T
      cgroup: remove cgroup_lock_is_held() · 2219449a
      Tejun Heo 提交于
      We don't want controllers to assume that the information is officially
      available and do funky things with it.
      
      The only user is task_subsys_state_check() which uses it to verify RCU
      access context.  We can move cgroup_lock_is_held() inside
      CONFIG_PROVE_RCU but that doesn't add meaningful protection compared
      to conditionally exposing cgroup_mutex.
      
      Remove cgroup_lock_is_held(), export cgroup_mutex iff CONFIG_PROVE_RCU
      and use lockdep_is_held() directly on the mutex in
      task_subsys_state_check().
      
      While at it, add parentheses around macro arguments in
      task_subsys_state_check().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      2219449a
    • T
      cgroup: unexport locking interface and cgroup_attach_task() · b9777cf8
      Tejun Heo 提交于
      Now that all external cgroup_lock() users are gone, we can finally
      unexport the locking interface and prevent future abuse of
      cgroup_mutex.
      
      Make cgroup_[un]lock() and cgroup_lock_live_group() static.  Also,
      cgroup_attach_task() doesn't have any user left and can't be used
      without locking interface anyway.  Make it static too.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      b9777cf8
    • T
      cgroup, cpuset: replace move_member_tasks_to_cpuset() with cgroup_transfer_tasks() · 8cc99345
      Tejun Heo 提交于
      When a cpuset becomes empty (no CPU or memory), its tasks are
      transferred with the nearest ancestor with execution resources.  This
      is implemented using cgroup_scan_tasks() with a callback which grabs
      cgroup_mutex and invokes cgroup_attach_task() on each task.
      
      Both cgroup_mutex and cgroup_attach_task() are scheduled to be
      unexported.  Implement cgroup_transfer_tasks() in cgroup proper which
      is essentially the same as move_member_tasks_to_cpuset() except that
      it takes cgroups instead of cpusets and @to comes before @from like
      normal functions with those arguments, and replace
      move_member_tasks_to_cpuset() with it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      8cc99345
  13. 20 3月, 2013 1 次提交
  14. 13 3月, 2013 1 次提交
  15. 06 3月, 2013 1 次提交
  16. 05 3月, 2013 1 次提交
    • L
      cgroup: fix cgroup_path() vs rename() race · 65dff759
      Li Zefan 提交于
      rename() will change dentry->d_name. The result of this race can
      be worse than seeing partially rewritten name, but we might access
      a stale pointer because rename() will re-allocate memory to hold
      a longer name.
      
      As accessing dentry->name must be protected by dentry->d_lock or
      parent inode's i_mutex, while on the other hand cgroup-path() can
      be called with some irq-safe spinlocks held, we can't generate
      cgroup path using dentry->d_name.
      
      Alternatively we make a copy of dentry->d_name and save it in
      cgrp->name when a cgroup is created, and update cgrp->name at
      rename().
      
      v5: use flexible array instead of zero-size array.
      v4: - allocate root_cgroup_name and all root_cgroup->name points to it.
          - add cgroup_name() wrapper.
      v3: use kfree_rcu() instead of synchronize_rcu() in user-visible path.
      v2: make cgrp->name RCU safe.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      65dff759
  17. 25 1月, 2013 1 次提交
  18. 08 1月, 2013 2 次提交
  19. 20 11月, 2012 9 次提交
  20. 10 11月, 2012 3 次提交
    • T
      cgroup: implement generic child / descendant walk macros · 574bd9f7
      Tejun Heo 提交于
      Currently, cgroup doesn't provide any generic helper for walking a
      given cgroup's children or descendants.  This patch adds the following
      three macros.
      
      * cgroup_for_each_child() - walk immediate children of a cgroup.
      
      * cgroup_for_each_descendant_pre() - visit all descendants of a cgroup
        in pre-order tree traversal.
      
      * cgroup_for_each_descendant_post() - visit all descendants of a
        cgroup in post-order tree traversal.
      
      All three only require the user to hold RCU read lock during
      traversal.  Verifying that each iterated cgroup is online is the
      responsibility of the user.  When used with proper synchronization,
      cgroup_for_each_descendant_pre() can be used to propagate state
      updates to descendants in reliable way.  See comments for details.
      
      v2: s/config/state/ in commit message and comments per Michal.  More
          documentation on synchronization rules.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujisu.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      574bd9f7
    • T
      cgroup: use rculist ops for cgroup->children · eb6fd504
      Tejun Heo 提交于
      Use RCU safe list operations for cgroup->children.  This will be used
      to implement cgroup children / descendant walking which can be used by
      controllers.
      
      Note that cgroup_create() now puts a new cgroup at the end of the
      ->children list instead of head.  This isn't strictly necessary but is
      done so that the iteration order is more conventional.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      eb6fd504
    • T
      cgroup: add cgroup_subsys->post_create() · a8638030
      Tejun Heo 提交于
      Currently, there's no way for a controller to find out whether a new
      cgroup finished all ->create() allocatinos successfully and is
      considered "live" by cgroup.
      
      This becomes a problem later when we add generic descendants walking
      to cgroup which can be used by controllers as controllers don't have a
      synchronization point where it can synchronize against new cgroups
      appearing in such walks.
      
      This patch adds ->post_create().  It's called after all ->create()
      succeeded and the cgroup is linked into the generic cgroup hierarchy.
      This plays the counterpart of ->pre_destroy().
      
      When used in combination with the to-be-added generic descendant
      iterators, ->post_create() can be used to implement reliable state
      inheritance.  It will be explained with the descendant iterators.
      
      v2: Added a paragraph about its future use w/ descendant iterators per
          Michal.
      
      v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
          Fixed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Glauber Costa <glommer@parallels.com>
      a8638030
  21. 06 11月, 2012 2 次提交