1. 25 1月, 2013 5 次提交
    • L
      cgroup: remove duplicate RCU free on struct cgroup · 86a3db56
      Li Zefan 提交于
      When destroying a cgroup, though in cgroup_diput() we've called
      synchronize_rcu(), we then still have to free it via call_rcu().
      
      The story is, long ago to fix a race between reading /proc/sched_debug
      and freeing cgroup, the code was changed to utilize call_rcu(). See
      commit a47295e6 ("cgroups: make
      cgroup_path() RCU-safe")
      
      As we've fixed cpu cgroup that cpu_cgroup_offline_css() is used
      to unregister a task_group so there won't be concurrent access
      to this task_group after synchronize_rcu() in diput(). Now we can
      just kfree(cgrp).
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      86a3db56
    • L
      sched: remove redundant NULL cgroup check in task_group_path() · 2a73991b
      Li Zefan 提交于
      A task_group won't be online (thus no one can see it) until
      cpu_cgroup_css_online(), and at that time tg->css.cgroup has
      been initialized, so this NULL check is redundant.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      2a73991b
    • L
      sched: split out css_online/css_offline from tg creation/destruction · ace783b9
      Li Zefan 提交于
      This is a preparaton for later patches.
      
      - What do we gain from cpu_cgroup_css_online():
      
      After ss->css_alloc() and before ss->css_online(), there's a small
      window that tg->css.cgroup is NULL. With this change, tg won't be seen
      before ss->css_online(), where it's added to the global list, so we're
      guaranteed we'll never see NULL tg->css.cgroup.
      
      - What do we gain from cpu_cgroup_css_offline():
      
      tg is freed via RCU, so is cgroup. Without this change, This is how
      synchronization works:
      
      cgroup_rmdir()
        no ss->css_offline()
      diput()
        syncornize_rcu()
        ss->css_free()       <-- unregister tg, and free it via call_rcu()
        kfree_rcu(cgroup)    <-- wait possible refs to cgroup, and free cgroup
      
      We can't just kfree(cgroup), because tg might access tg->css.cgroup.
      
      With this change:
      
      cgroup_rmdir()
        ss->css_offline()    <-- unregister tg
      diput()
        synchronize_rcu()    <-- wait possible refs to tg and cgroup
        ss->css_free()       <-- free tg
        kfree_rcu(cgroup)    <-- free cgroup
      
      As you see, kfree_rcu() is redundant now.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      ace783b9
    • L
      cgroup: initialize cgrp->dentry before css_alloc() · fe1c06ca
      Li Zefan 提交于
      With this change, we're guaranteed that cgroup_path() won't see NULL
      cgrp->dentry, and thus we can remove the NULL check in it.
      
      (Well, it's not strictly true, because dummptop.dentry is always NULL
       but we already handle that separately.)
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      fe1c06ca
    • L
      cgroup: remove a NULL check in cgroup_exit() · b5d646f5
      Li Zefan 提交于
      init_task.cgroups is initialized at boot phase, and whenver a ask
      is forked, it's cgroups pointer is inherited from its parent, and
      it's never set to NULL afterwards.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      b5d646f5
  2. 23 1月, 2013 1 次提交
    • L
      cgroup: fix bogus kernel warnings when cgroup_create() failed · 2739d3cc
      Li Zefan 提交于
      If cgroup_create() failed and cgroup_destroy_locked() is called to
      do cleanup, we'll see a bunch of warnings:
      
      cgroup_addrm_files: failed to remove 2MB.limit_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.usage_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.max_usage_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.failcnt, err=-2
      cgroup_addrm_files: failed to remove prioidx, err=-2
      cgroup_addrm_files: failed to remove ifpriomap, err=-2
      ...
      
      We failed to remove those files, because cgroup_create() has failed
      before creating those cgroup files.
      
      To fix this, we simply don't warn if cgroup_rm_file() can't find the
      cft entry.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      2739d3cc
  3. 15 1月, 2013 2 次提交
    • L
      cgroup: remove synchronize_rcu() from rebind_subsystems() · 130e3695
      Li Zefan 提交于
      Nothing's protected by RCU in rebind_subsystems(), and I can't think
      of a reason why it is needed.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      130e3695
    • L
      cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}() · 5d65bc0c
      Li Zefan 提交于
      These 2 syncronize_rcu()s make attaching a task to a cgroup
      quite slow, and it can't be ignored in some situations.
      
      A real case from Colin Cross: Android uses cgroups heavily to
      manage thread priorities, putting threads in a background group
      with reduced cpu.shares when they are not visible to the user,
      and in a foreground group when they are. Some RPCs from foreground
      threads to background threads will temporarily move the background
      thread into the foreground group for the duration of the RPC.
      This results in many calls to cgroup_attach_task.
      
      In cgroup_attach_task() it's task->cgroups that is protected by RCU,
      and put_css_set() calls kfree_rcu() to free it.
      
      If we remove this synchronize_rcu(), there can be threads in RCU-read
      sections accessing their old cgroup via current->cgroups with
      concurrent rmdir operation, but this is safe.
      
       # time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }
      
      real    0m2.524s
      user    0m0.008s
      sys     0m0.004s
      
      With this patch:
      
      real    0m0.004s
      user    0m0.004s
      sys     0m0.000s
      
      tj: These synchronize_rcu()s are utterly confused.  synchornize_rcu()
          necessarily has to come between two operations to guarantee that
          the changes made by the former operation are visible to all rcu
          readers before proceeding to the latter operation.  Here,
          synchornize_rcu() are at the end of attach operations with nothing
          beyond it.  Its only effect would be delaying completion of
          write(2) to sysfs tasks/procs files until all rcu readers see the
          change, which doesn't mean anything.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NColin Cross <ccross@google.com>
      5d65bc0c
  4. 11 1月, 2013 1 次提交
  5. 08 1月, 2013 1 次提交
  6. 26 12月, 2012 1 次提交
    • E
      pidns: Stop pid allocation when init dies · c876ad76
      Eric W. Biederman 提交于
      Oleg pointed out that in a pid namespace the sequence.
      - pid 1 becomes a zombie
      - setns(thepidns), fork,...
      - reaping pid 1.
      - The injected processes exiting.
      
      Can lead to processes attempting access their child reaper and
      instead following a stale pointer.
      
      That waitpid for init can return before all of the processes in
      the pid namespace have exited is also unfortunate.
      
      Avoid these problems by disabling the allocation of new pids in a pid
      namespace when init dies, instead of when the last process in a pid
      namespace is reaped.
      Pointed-out-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      c876ad76
  7. 25 12月, 2012 1 次提交
  8. 21 12月, 2012 2 次提交
  9. 20 12月, 2012 7 次提交
  10. 19 12月, 2012 3 次提交
  11. 18 12月, 2012 9 次提交
  12. 17 12月, 2012 1 次提交
  13. 15 12月, 2012 3 次提交
  14. 14 12月, 2012 3 次提交