1. 22 Mar, 2012 1 commit
  2. 22 Feb, 2012 2 commits
    • cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list · 3ce3230a
      Frederic Weisbecker committed
      Walking through the tasklist in cgroup_enable_task_cg_list() inside
      an RCU read side critical section is not enough because:
      
      - RCU is not (yet) safe against while_each_thread()
      
      - If we use only RCU, a forking task that has passed cgroup_post_fork()
        without seeing use_task_css_set_links == 1 is not guaranteed to have
        its child immediately visible in the tasklist if we walk through it
        remotely with RCU. In this case it will be missing in its css_set's
        task list.
      
      Thus we need to traverse the list (unfortunately) under the
      tasklist_lock. This makes us safe against while_each_thread() and also
      makes sure we see all forked tasks that have been added to the tasklist.
      
      As a secondary effect, reading and writing use_task_css_set_links are
      now well ordered against tasklist traversing and modification. The new
      layout is:
      
      CPU 0                                      CPU 1

      use_task_css_set_links = 1                 write_lock(tasklist_lock)
      read_lock(tasklist_lock)                   add task to tasklist
      do_each_thread() {                         write_unlock(tasklist_lock)
          add thread to css set links            if (use_task_css_set_links)
      } while_each_thread()                          add thread to css set links
      read_unlock(tasklist_lock)
      
      If CPU 0 traverses the list after the task has been added to the tasklist,
      then it is correctly added to the css set links. OTOH, if CPU 0 traverses
      the tasklist before the new task had the opportunity to be added to the
      tasklist because it was too early in the fork process, then CPU 1
      catches up and adds the task to the css set links after it added the task
      to the tasklist. The right value of use_task_css_set_links is guaranteed
      to be visible from CPU 1 due to the LOCK/UNLOCK implicit barrier properties:
      the read_unlock on CPU 0 makes the write to use_task_css_set_links happen,
      and the write_lock on CPU 1 makes the subsequent read of
      use_task_css_set_links return the correct value.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
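The ordering argument in the commit above can be checked with a small single-threaded model. This is only a sketch, not kernel code: `struct task`, `struct state`, `link_task()`, `fork_add_to_tasklist()`, `post_fork()` and `run_interleaving()` are hypothetical stand-ins, each tasklist_lock critical section is modelled as one indivisible step, and linking is assumed to be idempotent (a task that is already on the css set links is not added twice).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical stand-ins for the kernel objects named in the commit. */
struct task {
    bool on_tasklist;  /* visible to a tasklist walk */
    int  link_count;   /* times it was added to the css set links */
};

struct state {
    bool use_task_css_set_links;
    struct task tasks[MAX_TASKS];
    size_t ntasks;
};

/* Assumption: linking is idempotent, so a second attempt is a no-op. */
static void link_task(struct task *t)
{
    if (t->link_count == 0)
        t->link_count++;
}

/* CPU 0: set the flag, then walk the tasklist as one indivisible step
 * (this models the read_lock/read_unlock section). */
static void enable_task_cg_list(struct state *s)
{
    s->use_task_css_set_links = true;
    for (size_t i = 0; i < s->ntasks; i++)
        if (s->tasks[i].on_tasklist)
            link_task(&s->tasks[i]);
}

/* CPU 1, step 1: add the child to the tasklist
 * (this models the write_lock/write_unlock section). */
static struct task *fork_add_to_tasklist(struct state *s)
{
    struct task *t = &s->tasks[s->ntasks++];
    t->on_tasklist = true;
    t->link_count = 0;
    return t;
}

/* CPU 1, step 2: after write_unlock, check the flag and catch up. */
static void post_fork(struct state *s, struct task *t)
{
    if (s->use_task_css_set_links)
        link_task(t);
}

/* walk_position: 0 = CPU 0 walks before the child is on the tasklist,
 * 1 = between the tasklist add and the flag check,
 * 2 = after the flag check. Returns how often the child was linked. */
int run_interleaving(int walk_position)
{
    struct state s = { 0 };
    struct task *child;

    if (walk_position == 0)
        enable_task_cg_list(&s);
    child = fork_add_to_tasklist(&s);
    if (walk_position == 1)
        enable_task_cg_list(&s);
    post_fork(&s, child);
    if (walk_position == 2)
        enable_task_cg_list(&s);
    return child->link_count;
}
```

Calling `run_interleaving()` with 0, 1 and 2 covers the three possible positions of the walk relative to the fork steps. In each case the child ends up on the css set links exactly once, which is the property the commit message argues for.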
    • cgroup: Remove wrong comment on cgroup_enable_task_cg_list() · 9a4b4304
      Frederic Weisbecker committed
      Remove the stale comment about RCU protection. Many callers
      (all of them?) of cgroup_enable_task_cg_list() don't seem
      to be in an RCU read side critical section. Besides, RCU is
      not helpful to protect against while_each_thread().
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  3. 03 Feb, 2012 1 commit
    • cgroup: remove cgroup_subsys argument from callbacks · 761b3ef5
      Li Zefan committed
      The argument is not used at all, and it's not necessary, because
      a specific callback handler of course knows which subsys it
      belongs to.
      
      Now only ->populate() takes this argument, because the handlers of
      this callback always call cgroup_add_file()/cgroup_add_files().
      
      So we reduce a few lines of code, though the shrinking of object size
      is minimal.
      
       16 files changed, 113 insertions(+), 162 deletions(-)
      
         text    data     bss      dec     hex filename
      5486240  656987 7039960 13183187  c928d3 vmlinux.o.orig
      5486170  656987 7039960 13183117  c9288d vmlinux.o
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
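The shape of the change above can be illustrated with a reduced callback table. This is a sketch with simplified, hypothetical types: the real kernel callbacks take further arguments, and `demo_can_attach()`, `demo_populate()`, `demo_subsys` and the `files_added` counter exist only for this example. It shows a handler that no longer receives its subsystem, next to a populate handler that still does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical versions of the kernel structures. */
struct cgroup {
    int id;
};

struct cgroup_subsys {
    const char *name;
    /* After the change: the handler does not receive its subsystem. */
    int (*can_attach)(struct cgroup *cgrp);
    /* populate still takes the subsystem, since its handlers pass it on
     * when adding control files. */
    int (*populate)(struct cgroup_subsys *ss, struct cgroup *cgrp);
    int files_added;  /* illustration only, not a kernel field */
};

/* The handler already knows which subsystem it belongs to. */
static int demo_can_attach(struct cgroup *cgrp)
{
    return cgrp->id >= 0 ? 0 : -1;
}

/* Stand-in for a handler that would call cgroup_add_file(). */
static int demo_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
{
    (void)cgrp;
    ss->files_added++;
    return 0;
}

struct cgroup_subsys demo_subsys = {
    .name = "demo",
    .can_attach = demo_can_attach,
    .populate = demo_populate,
    .files_added = 0,
};
```

A caller invokes `demo_subsys.can_attach(&cgrp)` without passing the subsystem, while `demo_subsys.populate(&demo_subsys, &cgrp)` keeps the extra argument.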
  4. 31 Jan, 2012 1 commit
    • cgroup: remove extra calls to find_existing_css_set · 61d1d219
      Mandeep Singh Baines committed
      In cgroup_attach_proc, we indirectly call find_existing_css_set 3
      times. It is an expensive call, so we want to call it as few times
      as possible. This patch calls it only once and stores the result so
      that it can be used later on when we call cgroup_task_migrate.
      
      This required modifying cgroup_task_migrate to take the new css_set
      (which we obtained from find_css_set) as a parameter. The nice side
      effect of this is that cgroup_task_migrate is now identical for
      cgroup_attach_task and cgroup_attach_proc. It also now returns a
      void since it can never fail.
      
      Changes in V5:
      * https://lkml.org/lkml/2012/1/20/344 (Tejun Heo)
        * Remove css_set_refs
      Changes in V4:
      * https://lkml.org/lkml/2011/12/22/421 (Li Zefan)
        * Avoid GFP_KERNEL (sleep) in rcu_read_lock by getting css_set in
          a separate loop not under an rcu_read_lock
      Changes in V3:
      * https://lkml.org/lkml/2011/12/22/13 (Li Zefan)
        * Fixed earlier bug by creating a separate patch to remove tasklist_lock
      Changes in V2:
      * https://lkml.org/lkml/2011/12/20/372 (Tejun Heo)
        * Move find_css_set call into loop which creates the flex array
      * Author
        * Kill css_set_refs and use group_size instead
        * Fix an off-by-one error in counting css_set refs
        * Add a retval check in out_list_teardown
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Acked-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: containers@lists.linux-foundation.org
      Cc: cgroups@vger.kernel.org
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul Menage <paul@paulmenage.org>
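The "look up once, store, reuse" pattern described in the commit above might be sketched as follows. None of this is the kernel implementation: `expensive_lookup()` stands in for the costly css_set search, `struct task_and_cg`, `task_migrate()` and `attach_proc()` are hypothetical names, and a fixed-size array replaces the flex array mentioned in the changelog.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 4

struct css_set {
    int id;
};

struct task {
    struct css_set *cg;
};

/* Pairs each task with the css_set found for it in the first pass. */
struct task_and_cg {
    struct task *task;
    struct css_set *newcg;
};

static struct css_set target = { 42 };
static int lookup_calls;  /* counts how often the expensive call runs */

/* Hypothetical stand-in for the expensive css_set lookup. */
static struct css_set *expensive_lookup(const struct task *t)
{
    (void)t;
    lookup_calls++;
    return &target;
}

/* Migration cannot fail once the css_set is known, so it returns void. */
static void task_migrate(struct task *t, struct css_set *newcg)
{
    t->cg = newcg;
}

/* Pass 1 resolves and stores every css_set (the only step that can fail).
 * Pass 2 migrates using the stored results, with no further lookups. */
int attach_proc(struct task *group, size_t n)
{
    struct task_and_cg tc[GROUP_SIZE];
    size_t i;

    if (n > GROUP_SIZE)
        return -1;
    for (i = 0; i < n; i++) {
        tc[i].task = &group[i];
        tc[i].newcg = expensive_lookup(&group[i]);
        if (!tc[i].newcg)
            return -1;
    }
    for (i = 0; i < n; i++)
        task_migrate(tc[i].task, tc[i].newcg);
    return 0;
}
```

Because all failures are confined to the first pass, the migration step can take the precomputed css_set as a parameter and return nothing, which mirrors the reasoning given for making cgroup_task_migrate return void.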
  5. 21 Jan, 2012 3 commits
  6. 07 Jan, 2012 1 commit
  7. 06 Jan, 2012 1 commit
    • cgroup: fix to allow mounting a hierarchy by name · 0d19ea86
      Li Zefan committed
      If we mount a hierarchy with a specified name, the name is unique,
      and we can use it to mount the hierarchy without specifying its
      set of subsystem names. This feature is documented in
      Documentation/cgroups/cgroups.txt section 2.3.
      
      Here's an example:
      
      	# mount -t cgroup -o cpuset,name=myhier xxx /cgroup1
      	# mount -t cgroup -o name=myhier xxx /cgroup2
      
      But it was broken by commit 32a8cf23
      (cgroup: make the mount options parsing more accurate)
      
      This fixes the regression.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
  8. 04 Jan, 2012 3 commits
  9. 28 Dec, 2011 3 commits
  10. 22 Dec, 2011 5 commits
  11. 20 Dec, 2011 2 commits
  12. 13 Dec, 2011 6 commits
  13. 03 Nov, 2011 3 commits
  14. 13 Sep, 2011 1 commit
  15. 27 Jul, 2011 1 commit
  16. 20 Jul, 2011 1 commit
  17. 09 Jul, 2011 1 commit
  18. 09 Jun, 2011 1 commit
    • cgroupfs: use init_cred when populating new cgroupfs mount · 2ce9738b
      eparis@redhat committed
      We recently found that in some configurations SELinux was blocking the ability
      for cgroupfs to be mounted.  The reason is that cgroupfs creates
      files and directories during the get_sb() call and also uses lookup_one_len()
      during that same get_sb() call.  This is a problem since the security
      subsystem cannot initialize the superblock and the inodes in that filesystem
      until after the get_sb() call returns.  Thus we leave the inodes in
      an uninitialized state during get_sb().  For the vast majority of filesystems
      this is not an issue, but since cgroupfs uses lookup_one_len() it does
      search permission checks on the directories in the path it walks.  Since the
      inode security state is not set up, SELinux does these checks as if the inodes
      were 'unlabeled.'
      
      Many 'normal' userspace processes do not have permission to interact with
      unlabeled inodes.  The solution presented here is to do the permission checks
      of path walk and inode creation as the kernel rather than as the task that
      called mount.  Since the kernel has permission to read/write/create
      unlabeled inodes the get_sb() call will complete successfully and the SELinux
      code will be able to initialize the superblock and those inodes created during
      the get_sb() call.
      
      This appears to be the same solution used by other filesystems such as devtmpfs
      to solve the same issue and should thus have no negative impact on other LSMs
      which currently work.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Paul Menage <menage@google.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  19. 27 May, 2011 3 commits