    sched: fix copy_namespace() <-> sched_fork() dependency in do_fork · 3c90e6e9
    Committed by Srivatsa Vaddagiri
    Sukadev Bhattiprolu reported a kernel crash with control groups.
    There are a couple of problems discovered by Suka's test:
    
    - The test requires the cgroup filesystem to be mounted with
      at least the cpu and ns options (i.e. both the namespace and cpu
      controllers are active in the same hierarchy).
    
    	# mkdir /dev/cpuctl
    	# mount -t cgroup -ocpu,ns none cpuctl
    	(or simply)
    	# mount -t cgroup none cpuctl -> Will activate all controllers
    					 in same hierarchy.
    
    - The test invokes clone() with CLONE_NEWNS set. This creates a new child
      and also a new group (do_fork->copy_namespaces->ns_cgroup_clone->
      cgroup_clone), and the child is attached to the new group (cgroup_clone->
      attach_task->sched_move_task). At this point the child's
      scheduler-related fields are uninitialized (including its on_rq field,
      which it has inherited from the parent). As a result, sched_move_task
      thinks the child is on a runqueue when it isn't.
    
      As a solution to this problem, I moved the sched_fork() call, which
      initializes the scheduler-related fields of a new task, before
      copy_namespaces(). I am not sure, though, whether moving it up will
      cause other side effects. Do you see any issue?
    
    - The second problem exposed by this test is that task_new_fair()
      assumes that parent and child will be part of the same group (which
      need not be the case, as this test shows). As a result, cfs_rq->curr
      can be NULL for the child.
    
      The solution is to test for curr pointer being NULL in
      task_new_fair().
    
    With the patch below, I could run ns_exec() fine without a crash.
    Reported-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
    Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>