1. 10 July 2014, 9 commits
    • cpuset: refactor cpuset_hotplug_update_tasks() · 390a36aa
      Li Zefan authored
      We mix the handling for both the default hierarchy and the legacy hierarchy
      in the same function, and it's quite messy, so split it into two functions.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      390a36aa
    • cpuset: make cs->{cpus, mems}_allowed as user-configured masks · 7e88291b
      Li Zefan authored
      Now that we use effective cpumasks to enforce the hierarchical behavior,
      we can treat cs->{cpus,mems}_allowed as the user-configured masks.
      
      Configured masks can be changed by writing cpuset.cpus and cpuset.mems
      only. The new behaviors are:
      
      - They won't be changed by hotplug anymore.
      - They won't be limited by their parent's masks.
      
      This is a behavior change, but it won't take effect unless mounted with
      sane_behavior.
      
      v2:
      - Add comments to explain the differences between configured masks and
      effective masks.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      7e88291b
    • cpuset: apply cs->effective_{cpus,mems} · ae1c8023
      Li Zefan authored
      Now we can use cs->effective_{cpus,mems} as effective masks. It's
      used whenever:
      
      - we update tasks' cpus_allowed/mems_allowed,
      - we want to retrieve task_cs(tsk)'s cpus_allowed/mems_allowed.
      
      They actually replace effective_{cpu,node}mask_cpuset().
      
      effective_mask == configured_mask & parent_effective_mask, except when
      the result is empty, in which case it inherits the parent's effective_mask.
      The result equals the mask computed from effective_{cpu,node}mask_cpuset().
      
      This won't affect the original legacy hierarchy, because in that case we
      make sure the effective masks are always the same as the user-configured
      masks.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ae1c8023
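      The rule above can be pictured with a small standalone sketch. This is
      only an illustration using a plain unsigned long in place of the kernel's
      cpumask/nodemask types; the function name and the example values are
      invented, not kernel code:

      	#include <stdio.h>

      	/* effective = configured & parent_effective; if the intersection is
      	 * empty, fall back to inheriting the parent's effective mask. */
      	static unsigned long effective_mask(unsigned long configured,
      					    unsigned long parent_effective)
      	{
      		unsigned long eff = configured & parent_effective;

      		return eff ? eff : parent_effective;
      	}

      	int main(void)
      	{
      		/* Parent effectively owns CPUs 0-3 (0xf); a child configured
      		 * with CPUs 2-5 (0x3c) ends up with CPUs 2-3 (0xc). */
      		printf("%#lx\n", effective_mask(0x3c, 0x0f));

      		/* Child configured with CPUs 4-5 only (0x30): the intersection
      		 * is empty, so it inherits the parent's CPUs 0-3 (0xf). */
      		printf("%#lx\n", effective_mask(0x30, 0x0f));
      		return 0;
      	}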
    • cpuset: initialize top_cpuset's configured masks at mount · 39bd0d15
      Li Zefan authored
      Now that we have to support different behaviors for the default hierarchy
      and the legacy hierarchy, top_cpuset's configured masks need to be
      initialized accordingly.
      
      Suppose we've offlined cpu1.
      
      On default hierarchy:
      
      	# mount -t cgroup -o __DEVEL__sane_behavior xxx /cpuset
      	# cat /cpuset/cpuset.cpus
      	0-15
      
      On legacy hierarchy:
      
      	# mount -t cgroup xxx /cpuset
      	# cat /cpuset/cpuset.cpus
      	0,2-15
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      39bd0d15
    • cpuset: use effective cpumask to build sched domains · 8b5f1c52
      Li Zefan authored
      We're going to have separate user-configured masks and effective ones.
      
      Eventually, configured masks can only be changed by writing cpuset.cpus
      and cpuset.mems, and they won't be restricted by the parent cpuset, while
      effective masks reflect cpu/memory hotplug and hierarchical restrictions;
      these are the real masks that apply to the tasks in the cpuset.
      
      We calculate effective mask this way:
        - top cpuset's effective_mask == online_mask, otherwise
        - cpuset's effective_mask == configured_mask & parent effective_mask,
          if the result is empty, it inherits parent effective mask.
      
      Those behavior changes are for default hierarchy only. For legacy
      hierarchy, effective_mask and configured_mask are the same, so we won't
      break old interfaces.
      
      We should partition sched domains according to effective_cpus, which
      is the real cpulist that takes effect on the tasks in the cpuset.
      
      This won't introduce behavior change.
      
      v2:
      - Add a comment for the call of rebuild_sched_domains(), suggested
      by Tejun.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      8b5f1c52
    • cpuset: inherit ancestor's masks if effective_{cpus, mems} becomes empty · 554b0d1c
      Li Zefan authored
      We're going to have separate user-configured masks and effective ones.
      
      Eventually, configured masks can only be changed by writing cpuset.cpus
      and cpuset.mems, and they won't be restricted by the parent cpuset, while
      effective masks reflect cpu/memory hotplug and hierarchical restrictions;
      these are the real masks that apply to the tasks in the cpuset.
      
      We calculate effective mask this way:
        - top cpuset's effective_mask == online_mask, otherwise
        - cpuset's effective_mask == configured_mask & parent effective_mask,
          if the result is empty, it inherits parent effective mask.
      
      Those behavior changes are for default hierarchy only. For legacy
      hierarchy, effective_mask and configured_mask are the same, so we won't
      break old interfaces.
      
      To make cs->effective_{cpus,mems} the effective masks, we need to:
        - update the effective masks at hotplug
        - update the effective masks at config change
        - take on ancestor's mask when the effective mask is empty
      
      The last item is done here.
      
      This won't introduce behavior change.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      554b0d1c
    • cpuset: update cs->effective_{cpus, mems} when config changes · 734d4513
      Li Zefan authored
      We're going to have separate user-configured masks and effective ones.
      
      Eventually, configured masks can only be changed by writing cpuset.cpus
      and cpuset.mems, and they won't be restricted by the parent cpuset, while
      effective masks reflect cpu/memory hotplug and hierarchical restrictions;
      these are the real masks that apply to the tasks in the cpuset.
      
      We calculate effective mask this way:
        - top cpuset's effective_mask == online_mask, otherwise
        - cpuset's effective_mask == configured_mask & parent effective_mask,
          if the result is empty, it inherits parent effective mask.
      
      Those behavior changes are for default hierarchy only. For legacy
      hierarchy, effective_mask and configured_mask are the same, so we won't
      break old interfaces.
      
      To make cs->effective_{cpus,mems} the effective masks, we need to:
        - update the effective masks at hotplug
        - update the effective masks at config change
        - take on ancestor's mask when the effective mask is empty
      
      The second item is done here. We don't need to treat root_cs specially
      in update_cpumasks_hier().
      
      This won't introduce behavior change.
      
      v3:
      - add a WARN_ON() to check that the effective masks are the same as the
        configured masks on the legacy hierarchy.
      - pass trialcs->cpus_allowed to update_cpumasks_hier() and add a comment for
        it. Similar change for update_nodemasks_hier(). Suggested by Tejun.
      
      v2:
      - revise the comment in update_{cpu,node}masks_hier(), suggested by Tejun.
      - fix to use @cp instead of @cs in these two functions.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      734d4513
    • cpuset: update cpuset->effective_{cpus,mems} at hotplug · 1344ab9c
      Li Zefan authored
      We're going to have separate user-configured masks and effective ones.
      
      Eventually, configured masks can only be changed by writing cpuset.cpus
      and cpuset.mems, and they won't be restricted by the parent cpuset, while
      effective masks reflect cpu/memory hotplug and hierarchical restrictions;
      these are the real masks that apply to the tasks in the cpuset.
      
      We calculate effective mask this way:
        - top cpuset's effective_mask == online_mask, otherwise
        - cpuset's effective_mask == configured_mask & parent effective_mask,
          if the result is empty, it inherits parent effective mask.
      
      Those behavior changes are for default hierarchy only. For legacy
      hierarchy, effective_mask and configured_mask are the same, so we won't
      break old interfaces.
      
      To make cs->effective_{cpus,mems} the effective masks, we need to:
        - update the effective masks at hotplug
        - update the effective masks at config change
        - take on ancestor's mask when the effective mask is empty
      
      The first item is done here.
      
      This won't introduce behavior change.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1344ab9c
    • cpuset: add cs->effective_cpus and cs->effective_mems · e2b9a3d7
      Li Zefan authored
      We're going to have separate user-configured masks and effective ones.
      
      Eventually, configured masks can only be changed by writing cpuset.cpus
      and cpuset.mems, and they won't be restricted by the parent cpuset, while
      effective masks reflect cpu/memory hotplug and hierarchical restrictions;
      these are the real masks that apply to the tasks in the cpuset.
      
      We calculate effective mask this way:
        - top cpuset's effective_mask == online_mask, otherwise
        - cpuset's effective_mask == configured_mask & parent effective_mask,
          if the result is empty, it inherits parent effective mask.
      
      Those behavior changes are for default hierarchy only. For legacy
      hierarchy, effective_mask and configured_mask are the same, so we won't
      break old interfaces.
      
      This patch adds the effective masks to struct cpuset and initializes
      them. The effective masks of the top cpuset are the same as its configured
      masks, and a child cpuset inherits its parent's effective masks.
      
      This won't introduce behavior change.
      
      v2:
      - s/real_{mems,cpus}_allowed/effective_{mems,cpus}, suggested by Tejun.
      - don't init effective masks in cpuset_css_online() if !cgroup_on_dfl.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      e2b9a3d7
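      A minimal sketch of the data-structure change described above. The field
      names follow the commit, but the plain bitmask type and the init helper
      are simplified stand-ins for illustration, not the kernel's cpumask API:

      	struct toy_cpuset {
      		unsigned long cpus_allowed;	/* user-configured via cpuset.cpus */
      		unsigned long mems_allowed;	/* user-configured via cpuset.mems */
      		unsigned long effective_cpus;	/* CPUs tasks may actually run on */
      		unsigned long effective_mems;	/* nodes tasks may actually use */
      	};

      	/* On the default hierarchy a new child starts out with its parent's
      	 * effective masks; the top cpuset's effective masks simply mirror
      	 * its configured masks. */
      	static void toy_cpuset_init_child(struct toy_cpuset *cs,
      					  const struct toy_cpuset *parent)
      	{
      		cs->effective_cpus = parent->effective_cpus;
      		cs->effective_mems = parent->effective_mems;
      	}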
  2. 09 July 2014, 9 commits
    • cgroup: clean up sane_behavior handling · 7b9a6ba5
      Tejun Heo authored
      After the previous patch to remove sane_behavior support from
      non-default hierarchies, CGRP_ROOT_SANE_BEHAVIOR is used only to
      indicate the default hierarchy while parsing mount options.  This
      patch makes the following cleanups around it.
      
      * Don't show it in the mount option.  Eventually the default hierarchy
        will be assigned a different filesystem type.
      
      * As sane_behavior is no longer effective on non-default hierarchies
        and the default hierarchy doesn't accept any mount options,
        parse_cgroupfs_options() can consider sane_behavior mount option as
        indicating the default hierarchy and fail if any other options are
        specified with it.  While at it, remove one of the double blank
        lines in the function.
      
      * cgroup_mount() can now simply test CGRP_ROOT_SANE_BEHAVIOR to tell
        whether to mount the default hierarchy or not.
      
      * As CGRP_ROOT_SANE_BEHAVIOR's only role now is indicating whether
        to select the default hierarchy or not during mount, it doesn't need
        to be set in the default hierarchy itself.  cgroup_init_early()
        updated accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      7b9a6ba5
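      A rough sketch of the mount-option rule described above, assuming nothing
      about the kernel's actual parser beyond what the commit states:
      __DEVEL__sane_behavior selects the default hierarchy and may not be
      combined with any other option. The helper below is invented for
      illustration:

      	#include <stdbool.h>
      	#include <stdio.h>
      	#include <string.h>

      	/* Returns true if @opts is acceptable; *is_dfl tells whether the
      	 * default hierarchy was requested.  Rejects sane_behavior combined
      	 * with any other option. */
      	static bool toy_parse_cgroupfs_options(const char *opts, bool *is_dfl)
      	{
      		char buf[256];
      		bool sane = false, other = false;

      		snprintf(buf, sizeof(buf), "%s", opts ? opts : "");
      		for (char *tok = strtok(buf, ","); tok; tok = strtok(NULL, ","))
      			if (!strcmp(tok, "__DEVEL__sane_behavior"))
      				sane = true;
      			else
      				other = true;

      		if (sane && other)
      			return false;	/* no other options allowed with it */
      		*is_dfl = sane;
      		return true;
      	}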
    • cgroup: remove sane_behavior support on non-default hierarchies · aa6ec29b
      Tejun Heo authored
      sane_behavior has been used as a development vehicle for the default
      unified hierarchy.  Now that the default hierarchy is in place, the
      flag became redundant and confusing as its usage is allowed on all
      hierarchies.  From now on there will be either the default hierarchy or
      legacy ones.  Let's make that clear by removing sane_behavior support
      on non-default hierarchies.
      
      This patch replaces cgroup_sane_behavior() with cgroup_on_dfl().  The
      comment on top of CGRP_ROOT_SANE_BEHAVIOR is moved to on top of
      cgroup_on_dfl() with sane_behavior specific part dropped.
      
      On the default and legacy hierarchies w/o sane_behavior, this
      shouldn't cause any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      aa6ec29b
    • cgroup: make interface file "cgroup.sane_behavior" legacy-only · c1d5d42e
      Tejun Heo authored
      "cgroup.sane_behavior" is added to help distinguishing whether
      sane_behavior is in effect or not.  We now have the default hierarchy
      where the flag is always in effect and are planning to remove
      supporting sane behavior on the legacy hierarchies making this file on
      the default hierarchy rather pointless.  Let's make it legacy only and
      thus always zero.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      c1d5d42e
    • cgroup: remove CGRP_ROOT_OPTION_MASK · 7450e90b
      Tejun Heo authored
      cgroup_root->flags only contains CGRP_ROOT_* flags and there's no
      reason to mask the flags.  Remove CGRP_ROOT_OPTION_MASK.
      
      This doesn't cause any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      7450e90b
    • cgroup: implement cgroup_subsys->depends_on · af0ba678
      Tejun Heo authored
      Currently, the blkio subsystem attributes all writeback IOs to the
      root.  One of the issues is that there's no way to tell who originated
      a writeback IO from the block layer.  Those IOs are usually issued
      asynchronously from a task which didn't have anything to do with
      actually generating the dirty pages.  The memory subsystem, when
      enabled, already keeps track of the ownership of each dirty page and
      it's desirable for blkio to piggyback instead of adding its own
      per-page tag.
      
      blkio piggybacking on memory is an implementation detail which
      preferably should be handled automatically without requiring explicit
      userland action.  To achieve that, this patch implements
      cgroup_subsys->depends_on which contains the mask of subsystems which
      should be enabled together when the subsystem is enabled.
      
      The previous patches already implemented the support for enabled but
      invisible subsystems and cgroup_subsys->depends_on can be easily
      implemented by updating cgroup_refresh_child_subsys_mask() so that it
      calculates cgroup->child_subsys_mask considering
      cgroup_subsys->depends_on of the explicitly enabled subsystems.
      
      Documentation/cgroups/unified-hierarchy.txt is updated to explain that
      subsystems may not become immediately available after being unused
      from userland and that dependency could be a factor in it.  As
      subsystems may already keep residual references, this doesn't
      significantly change how subsystem rebinding can be used.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      af0ba678
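      A hedged sketch of how the dependency closure could be computed from
      per-subsystem depends_on masks, in the spirit of
      cgroup_refresh_child_subsys_mask(); the subsystem list, masks, and data
      layout here are invented purely for illustration:

      	#include <stdio.h>

      	#define NR_SUBSYS 4
      	enum { SS_CPU, SS_MEMORY, SS_IO, SS_PIDS };

      	/* depends_on[ss]: subsystems that must be enabled along with ss. */
      	static const unsigned int depends_on[NR_SUBSYS] = {
      		[SS_IO] = 1u << SS_MEMORY,	/* e.g. blkio piggybacking on memory */
      	};

      	/* Expand the explicitly enabled set until no new dependency appears. */
      	static unsigned int child_subsys_mask(unsigned int subtree_control)
      	{
      		unsigned int cur = subtree_control;

      		for (;;) {
      			unsigned int next = cur;

      			for (int ss = 0; ss < NR_SUBSYS; ss++)
      				if (cur & (1u << ss))
      					next |= depends_on[ss];
      			if (next == cur)
      				return cur;
      			cur = next;
      		}
      	}

      	int main(void)
      	{
      		/* Enabling only io implicitly pulls in memory as well. */
      		printf("%#x\n", child_subsys_mask(1u << SS_IO));
      		return 0;
      	}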
    • cgroup: implement cgroup_subsys->css_reset() · b4536f0c
      Tejun Heo authored
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      The previous patches added support for explicitly and implicitly
      enabled subsystems and showing/hiding their interface files.  An
      explicitly enabled subsystem may become implicitly enabled if it's
      turned off through "cgroup.subtree_control" but there are subsystems
      depending on it.  In such cases, the subsystem, as it's turned off
      when seen from userland, shouldn't enforce any resource control.
      Also, the subsystem may be explicitly turned on later again and its
      interface files should be as close to the initial state as possible.
      
      This patch adds cgroup_subsys->css_reset() which is invoked when a css
      is hidden.  The callback should disable resource control and reset the
      state to the vanilla state.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      b4536f0c
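      A toy illustration of the callback contract just described; the type and
      the limit field are invented, and only the reset-to-vanilla behavior comes
      from the commit:

      	struct toy_css {
      		long max;			/* some per-css resource limit */
      	};

      	#define TOY_MAX_UNLIMITED	(-1L)	/* the vanilla "no limit" state */

      	/* Called when the css becomes hidden (kept only as an implicit
      	 * dependency): stop enforcing anything and go back to the initial
      	 * state, so a later explicit re-enable starts from a clean slate. */
      	static void toy_css_reset(struct toy_css *css)
      	{
      		css->max = TOY_MAX_UNLIMITED;
      	}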
    • cgroup: make interface files visible iff enabled on cgroup->subtree_control · f63070d3
      Tejun Heo authored
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      The preceding patch distinguished cgroup->subtree_control and
      ->child_subsys_mask, where the former is the set of subsystems explicitly
      configured by userland and the latter is the set of all enabled
      subsystems, which is currently equal to the former but will also include
      subsystems implicitly enabled through dependency.
      
      Subsystems which are enabled due to dependency shouldn't be visible to
      userland.  This patch updates cgroup_subtree_control_write() and
      create_css() such that interface files are not created for implicitly
      enabled subsystems.
      
      * @visible parameter is added to create_css().  Interface files are
        created only when true.
      
      * If an already implicitly enabled subsystem is turned on through
        "cgroup.subtree_control", the existing css should be used.  css
        draining is skipped.
      
      * cgroup_subtree_control_write() computes the new target
        cgroup->child_subsys_mask and create/kill or show/hide csses
        accordingly.
      
      As the two subsystem masks are still kept identical, this patch
      doesn't introduce any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      f63070d3
    • cgroup: introduce cgroup->subtree_control · 667c2491
      Tejun Heo authored
      cgroup is implementing support for subsystem dependency which would
      require a way to enable a subsystem even when it's not directly
      configured through "cgroup.subtree_control".
      
      Previously, cgroup->child_subsys_mask directly reflected
      "cgroup.subtree_control" and the enabled subsystems in the child
      cgroups.  This patch adds cgroup->subtree_control which
      "cgroup.subtree_control" operates on.  cgroup->child_subsys_mask is
      now calculated from cgroup->subtree_control by
      cgroup_refresh_child_subsys_mask(), which sets it identical to
      cgroup->subtree_control for now.
      
      This will allow using cgroup->child_subsys_mask for all the enabled
      subsystems including the implicit ones and ->subtree_control for
      tracking the explicitly requested ones.  This patch keeps the two
      masks identical and doesn't introduce any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      667c2491
    • cgroup: reorganize cgroup_subtree_control_write() · c29adf24
      Tejun Heo authored
      Make the following two reorganizations to
      cgroup_subtree_control_write().  These are to prepare for future
      changes and shouldn't cause any functional difference.
      
      * Move the availability check above the css offlining wait.
      
      * Move cgrp->child_subsys_mask update above new css creation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      c29adf24
  3. 17 June 2014, 2 commits
  4. 16 June 2014, 1 commit
    • rtmutex: Plug slow unlock race · 27e35715
      Thomas Gleixner authored
      When the rtmutex fast path is enabled the slow unlock function can
      create the following situation:
      
      spin_lock(foo->m->wait_lock);
      foo->m->owner = NULL;
      	    			rt_mutex_lock(foo->m); <-- fast path
      				free = atomic_dec_and_test(foo->refcnt);
      				rt_mutex_unlock(foo->m); <-- fast path
      				if (free)
      				   kfree(foo);
      
      spin_unlock(foo->m->wait_lock); <--- Use after free.
      
      Plug the race by changing the slow unlock to the following scheme:
      
           while (!rt_mutex_has_waiters(m)) {
                /* Clear the waiters bit in m->owner */
                clear_rt_mutex_waiters(m);
                owner = rt_mutex_owner(m);
                spin_unlock(m->wait_lock);
                if (cmpxchg(m->owner, owner, 0) == owner)
                        return;
                spin_lock(m->wait_lock);
           }
      
      So in case of a new waiter incoming while the owner tries the slow
      path unlock we have two situations:
      
       unlock(wait_lock);
      					lock(wait_lock);
       cmpxchg(p, owner, 0) == owner
       	    	   			mark_rt_mutex_waiters(lock);
      	 				acquire(lock);
      
      Or:
      
       unlock(wait_lock);
      					lock(wait_lock);
      	 				mark_rt_mutex_waiters(lock);
       cmpxchg(p, owner, 0) != owner
      					enqueue_waiter();
      					unlock(wait_lock);
       lock(wait_lock);
       wakeup_next_waiter();
       unlock(wait_lock);
      					lock(wait_lock);
      					acquire(lock);
      
      If the fast path is disabled, then the simple
      
         m->owner = NULL;
         unlock(m->wait_lock);
      
      is sufficient as all access to m->owner is serialized via
      m->wait_lock.
      
      Also document and clarify the wakeup_next_waiter function as suggested
      by Oleg Nesterov.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      27e35715
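      A userspace sketch of the slow-unlock scheme above, using C11 atomics and
      a pthread mutex standing in for wait_lock. The names and the toy waiter
      count are illustrative; this is not the kernel's rtmutex code, only the
      owner-clearing protocol the commit describes:

      	#include <pthread.h>
      	#include <stdatomic.h>
      	#include <stdint.h>

      	struct toy_rtmutex {
      		pthread_mutex_t wait_lock;	/* protects the waiter list */
      		_Atomic uintptr_t owner;	/* owner cookie, low bit = "has waiters" */
      		int nr_waiters;			/* stand-in for the rbtree of waiters */
      	};

      	static void toy_slow_unlock(struct toy_rtmutex *m, uintptr_t me)
      	{
      		pthread_mutex_lock(&m->wait_lock);

      		while (m->nr_waiters == 0) {
      			/* No waiters: clear the waiters bit in ->owner under
      			 * wait_lock, then drop wait_lock before releasing. */
      			atomic_store(&m->owner, me);
      			pthread_mutex_unlock(&m->wait_lock);

      			/* Only clear the owner if no new waiter marked it in
      			 * the meantime. */
      			uintptr_t expected = me;
      			if (atomic_compare_exchange_strong(&m->owner, &expected, 0))
      				return;	/* fully released */

      			/* cmpxchg failed: a waiter set the bit; re-check under
      			 * wait_lock. */
      			pthread_mutex_lock(&m->wait_lock);
      		}

      		/* Waiters present: wake the top waiter while still holding
      		 * wait_lock (wakeup elided in this sketch). */
      		pthread_mutex_unlock(&m->wait_lock);
      	}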
  5. 14 June 2014, 1 commit
  6. 11 June 2014, 4 commits
  7. 10 June 2014, 3 commits
  8. 09 June 2014, 3 commits
  9. 07 June 2014, 8 commits