1. 23 July 2014 (6 commits)
    • workqueue: use nr_node_ids instead of wq_numa_tbl_len · ddcb57e2
      Lai Jiangshan authored
      They are the same and nr_node_ids is provided by the memory subsystem.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ddcb57e2
    • workqueue: remove the misnamed out_unlock label in get_unbound_pool() · 3fb1823c
      Lai Jiangshan authored
      After the locking was moved up to the caller of get_unbound_pool(), the
      out_unlock label no longer performs any unlock operation and its name
      became misleading, so we just remove the label and replace its only
      usage site, "goto out_unlock", with "return pool".
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      3fb1823c
    • workqueue: remove the stale comment in pwq_unbound_release_workfn() · 29b1cb41
      Lai Jiangshan authored
      In 75ccf595 ("workqueue: prepare flush_workqueue() for dynamic
      creation and destruction of unbound pool_workqueues"), a comment
      about the synchronization for the pwq in pwq_unbound_release_workfn()
      was added. The comment claimed that the flush_mutex wasn't strictly
      necessary, which was correct at the time because the pwq was
      protected by workqueue_lock.
      
      It is incorrect now, however: wq->flush_mutex was renamed to
      wq->mutex and workqueue_lock was removed, so wq->mutex is strictly
      needed. The comment was simply not updated when the synchronization
      was changed.
      
      This patch removes the incorrect comment and doesn't add a new one
      to explain why wq->mutex is needed here, which is obvious; the "WQ"
      annotation in the definition of wq->pwqs_node already documents it.
      
      The old commit mentioned above also introduced a comment in link_pwq()
      about the synchronization. This comment is also removed in this patch
      since the whole of link_pwq() is protected by wq->mutex.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      29b1cb41
    • workqueue: move rescuer pool detachment to the end · 13b1d625
      Lai Jiangshan authored
      In 51697d39 ("workqueue: use generic attach/detach routine for
      rescuers"), the rescuer detaches itself from the pool before put_pwq()
      so that put_unbound_pool() will not destroy the rescuer-attached
      pool.
      
      This is unnecessary.  worker_detach_from_pool() can be used as the
      last statement that accesses the pool, just as for the regular
      workers; put_unbound_pool() will wait for it to detach and then free
      the pool.
      
      So we move worker_detach_from_pool() down to make the rescuer behave
      like the regular workers.
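      
      A minimal sketch of the reordered tail of the rescuer's loop (names
      follow the workqueue code of that period; illustrative, not the exact
      diff):
      
      	put_pwq(pwq);
      	spin_unlock_irq(&pool->lock);
      
      	/* now the rescuer's last access to the pool, as for regular workers */
      	worker_detach_from_pool(rescuer, pool);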
      
      tj: Minor description update.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      13b1d625
    • workqueue: unfold start_worker() into create_worker() · 051e1850
      Lai Jiangshan authored
      Simply unfold the code of start_worker() into create_worker() and
      remove the original start_worker() and create_and_start_worker().
      
      The only trade-off is the introduced overhead: pool->lock is
      released and regrabbed after the new worker is started.  The
      overhead is acceptable since the manager is a slow path.
      
      Because of this new locking behavior, the newly created worker
      may grab the lock earlier than the manager and go on to process
      work items.  In this case, the recheck of need_to_create_worker() may
      be true, as expected, and the manager goes on to restart, which is
      the correct behavior.
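      
      A hedged sketch of the tail of create_worker() after the unfolding
      (close to, but not necessarily identical with, the actual code):
      
      	/* start the newly created worker */
      	spin_lock_irq(&pool->lock);
      	worker->pool->nr_workers++;
      	worker_enter_idle(worker);
      	wake_up_process(worker->task);
      	spin_unlock_irq(&pool->lock);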
      
      tj: Minor updates to description and comments.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      051e1850
    • workqueue: remove @wakeup from worker_set_flags() · 228f1d00
      Lai Jiangshan authored
      worker_set_flags() has only two callers, each specifying %true and
      %false for @wakeup.  Let's push the wake up to the caller and remove
      @wakeup from worker_set_flags().  The caller can use the following
      instead if wakeup is necessary:
      
      	worker_set_flags();
      	if (need_more_worker(pool))
      		wake_up_worker(pool);
      
      This makes the code simpler.  This patch doesn't introduce behavior
      changes.
      
      tj: Updated description and comments.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      228f1d00
  2. 22 July 2014 (1 commit)
    • workqueue: remove an unneeded UNBOUND test before waking up the next worker · a489a03e
      Lai Jiangshan authored
      In process_one_work():
      
      	if ((worker->flags & WORKER_UNBOUND) && need_more_worker(pool))
      		wake_up_worker(pool);
      
      the first test is unneeded.  Even if the first test is removed, it
      doesn't affect the wake-up logic for WORKER_UNBOUND, and it will not
      introduce any useless wake-ups for normal per-cpu workers since
      nr_running is always >= 1.  It will introduce useless/redundant
      wake-ups for CPU_INTENSIVE, but this case is rare and the next patch
      will also remove this redundant wake-up.
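      
      After the removal, the wake-up path reduces to the following (a
      sketch of the resulting test):
      
      	if (need_more_worker(pool))
      		wake_up_worker(pool);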
      
      tj: Minor updates to the description and comment.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a489a03e
  3. 19 July 2014 (1 commit)
  4. 15 July 2014 (1 commit)
  5. 11 July 2014 (1 commit)
  6. 07 July 2014 (1 commit)
    • workqueue: zero cpumask of wq_numa_possible_cpumask on init · 5a6024f1
      Yasuaki Ishimatsu authored
      When hot-adding and onlining a CPU, a kernel panic occurs, showing the
      following call trace.
      
        BUG: unable to handle kernel paging request at 0000000000001d08
        IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10
        PGD 0
        Oops: 0000 [#1] SMP
        ...
        Call Trace:
         [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50
         [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0
         [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0
         [<ffffffff811926f1>] new_slab+0x91/0x300
         [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482
         [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0
         [<ffffffff810a3c78>] ? load_balance+0x218/0x890
         [<ffffffff8101a679>] ? sched_clock+0x9/0x10
         [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10
         [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200
         [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0
         [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60
         [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140
         [<ffffffff8105d0ec>] do_fork+0xbc/0x360
         [<ffffffff8105d3b6>] kernel_thread+0x26/0x30
         [<ffffffff81086652>] kthreadd+0x2c2/0x300
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
         [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
      
      In my investigation, I found the root cause to be
      wq_numa_possible_cpumask.  All entries of wq_numa_possible_cpumask are
      allocated by alloc_cpumask_var_node() and are then used without being
      initialized, so they hold garbage values.
      
      When hot-adding and onlining a CPU, wq_update_unbound_numa() is called.
      wq_update_unbound_numa() calls alloc_unbound_pwq(), which in turn calls
      get_unbound_pool().  In get_unbound_pool(), worker_pool->node is set
      as follows:
      
      	/* if cpumask is contained inside a NUMA node, we belong to that node */
      	if (wq_numa_enabled) {
      		for_each_node(node) {
      			if (cpumask_subset(pool->attrs->cpumask,
      					   wq_numa_possible_cpumask[node])) {
      				pool->node = node;
      				break;
      			}
      		}
      	}
      
      But wq_numa_possible_cpumask[node] does not hold a correct cpumask, so
      the wrong node is selected and the kernel panics.  With this patch, all
      entries of wq_numa_possible_cpumask are allocated by
      zalloc_cpumask_var_node() so that they are zero-initialized, and the
      panic disappears.
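      
      A hedged sketch of the fix in wq_numa_init() (the surrounding BUG_ON()
      error handling follows the code of that period):
      
      	for_each_node(node)
      		BUG_ON(!zalloc_cpumask_var_node(&tbl[node], GFP_KERNEL,
      				node_online(node) ? node : NUMA_NO_NODE));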
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: bce90380 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
      5a6024f1
  7. 02 July 2014 (2 commits)
    • workqueue: stronger test in process_one_work() · 85327af6
      Lai Jiangshan authored
      When POOL_DISASSOCIATED is cleared, the running worker's local CPU should
      be the same as pool->cpu without any exception even during cpu-hotplug.
      
      This patch changes "(proposition_A && proposition_B && proposition_C)"
      to "(proposition_B && proposition_C)": if the old compound proposition
      is true, the new one must be true too, so this won't hide any possible
      bug which could be caught by the old test.
      
      tj: Minor description update and dropped the obvious comment.
      
      CC: Jason J. Herne <jjherne@linux.vnet.ibm.com>
      CC: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      85327af6
    • workqueue: clear POOL_DISASSOCIATED in rebind_workers() · 3de5e884
      Lai Jiangshan authored
      a9ab775b ("workqueue: directly restore CPU affinity of workers
      from CPU_ONLINE") moved pool locking into rebind_workers() but left
      "pool->flags &= ~POOL_DISASSOCIATED" in workqueue_cpu_up_callback().
      
      There is nothing necessarily wrong with it, but there is no benefit
      either.  Let's move it into rebind_workers() and achieve the following
      benefits:
      
        1) better readability, POOL_DISASSOCIATED is cleared in rebind_workers()
           as expected.
      
        2) we can guarantee that, when POOL_DISASSOCIATED is clear, the
           running workers of the pool are on the local CPU (pool->cpu).
      
      tj: Minor description update.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      3de5e884
  8. 24 June 2014 (1 commit)
  9. 20 June 2014 (8 commits)
    • workqueue: stronger test in process_one_work() · 807407c0
      Lai Jiangshan authored
      After the recent changes, when POOL_DISASSOCIATED is cleared, the
      running worker's local CPU should be the same as pool->cpu without any
      exception even during cpu-hotplug.  Update the sanity check in
      process_one_work() accordingly.
      
      This patch changes "(proposition_A && proposition_B && proposition_C)"
      to "(proposition_B && proposition_C)": if the old compound proposition
      is true, the new one must be true too, so this will not hide any
      possible bug which can be caught by the old test.
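      
      A hedged sketch of the strengthened sanity check in
      process_one_work() (believed close to the resulting code):
      
      	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
      		     raw_smp_processor_id() != pool->cpu);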
      
      tj: Minor updates to the description.
      
      CC: Jason J. Herne <jjherne@linux.vnet.ibm.com>
      CC: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      807407c0
    • workqueue: clear POOL_DISASSOCIATED in rebind_workers() · f05b558d
      Lai Jiangshan authored
      Commit a9ab775b ("workqueue: directly restore CPU affinity of
      workers from CPU_ONLINE") moved the pool->lock locking into
      rebind_workers() without also moving "pool->flags &= ~POOL_DISASSOCIATED".
      
      There is nothing wrong with "pool->flags &= ~POOL_DISASSOCIATED" not
      being moved together, but there isn't any benefit either. We move it
      into rebind_workers() and achieve these benefits:
      
      1) Better readability.  POOL_DISASSOCIATED is cleared in
         rebind_workers() as expected.
      
      2) When POOL_DISASSOCIATED is cleared, we can ensure that all the
         running workers of the pool are on the local CPU (pool->cpu).
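      
      A hedged sketch of where the flag is now cleared (per-worker rebinding
      details elided):
      
      	/* in rebind_workers(), with the attach mutex held */
      	spin_lock_irq(&pool->lock);
      	pool->flags &= ~POOL_DISASSOCIATED;
      	/* WORKER_UNBOUND -> WORKER_REBOUND transitions follow here */
      	spin_unlock_irq(&pool->lock);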
      
      tj: Cosmetic updates to the code and description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      f05b558d
    • workqueue: sanity check pool->cpu in wq_worker_sleeping() · 92b69f50
      Lai Jiangshan authored
      In theory, pool->cpu equals @cpu in wq_worker_sleeping() once
      worker->flags has been checked.
      
      A "pool->cpu != cpu" sanity check will help us if something goes wrong.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      92b69f50
    • workqueue: clear leftover flags when detached · b62c0751
      Lai Jiangshan authored
      When a worker is detached, worker->flags may still have WORKER_UNBOUND
      or WORKER_REBOUND set.  This is OK in all cases:
        1) if it is a normal worker, the worker is about to die, so it is OK.
        2) if it is a rescuer, it may re-attach to a pool with this leftover
           flag[s]; this is still correct, except that it may cause an
           unneeded wakeup.
      
      It is correct but not good, so we just clear the leftover flags.
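      
      A hedged sketch of the clearing, at the end of worker_detach_from_pool():
      
      	worker->flags &= ~(WORKER_UNBOUND | WORKER_REBOUND);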
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      b62c0751
    • workqueue: remove useless WARN_ON_ONCE() · 25ef0958
      Lai Jiangshan authored
      The @cpu is fetched via smp_processor_id() in this function,
      so the check is useless.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      25ef0958
    • workqueue: use schedule_timeout_interruptible() instead of open code · e212f361
      Lai Jiangshan authored
      schedule_timeout_interruptible(CREATE_COOLDOWN) is exactly equivalent
      to the original open-coded sequence.
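      
      For reference, schedule_timeout_interruptible() wraps exactly this
      pattern (a sketch, assuming the open-coded form set the task state by
      hand):
      
      	/* before */
      	__set_current_state(TASK_INTERRUPTIBLE);
      	schedule_timeout(CREATE_COOLDOWN);
      
      	/* after */
      	schedule_timeout_interruptible(CREATE_COOLDOWN);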
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      e212f361
    • workqueue: remove the empty check in too_many_workers() · e6a9a771
      Lai Jiangshan authored
      Commit ea1abd61 ("workqueue: reimplement idle worker rebinding")
      used a trick which simply removed all to-be-bound idle workers from the
      idle list and let them add themselves back after completing rebinding.
      
      This trick meant that @worker_pool->nr_idle could deviate from the
      actual number of idle workers on @worker_pool->idle_list.  More
      specifically, nr_idle could be non-zero while ->idle_list was empty.
      All users of ->nr_idle and ->idle_list were audited; the only affected
      one was too_many_workers(), which was updated to return %false if
      ->idle_list is empty regardless of ->nr_idle.
      
      The commit/trick was complicated because it tried to simplify an even
      more complicated problem (workers had to rebind themselves).  But
      commit a9ab775b ("workqueue: directly restore CPU affinity of workers
      from CPU_ONLINE") fixed all these problems, and the mentioned trick
      became useless and is gone.
      
      So @worker_pool->nr_idle is now exactly the actual number of workers
      on @worker_pool->idle_list, and too_many_workers() should go back to
      what it was before the trick.  So we remove the empty check.
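      
      A hedged sketch of too_many_workers() with the empty check gone
      (believed close to the resulting code):
      
      	static bool too_many_workers(struct worker_pool *pool)
      	{
      		bool managing = mutex_is_locked(&pool->manager_arb);
      		int nr_idle = pool->nr_idle + managing; /* manager counts as idle */
      		int nr_busy = pool->nr_workers - nr_idle;
      
      		return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
      	}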
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      e6a9a771
    • workqueue: use "pool->cpu < 0" to stand for an unbound pool · 61d0fbb4
      Lai Jiangshan authored
      There is a piece of sanity-check code in put_unbound_pool().  Its
      meaning is "if it is not an unbound pool, complain and return", IIUC.
      But the code tests "pool->flags & POOL_DISASSOCIATED", which is
      imprecise: a non-unbound pool may also have this flag set.
      
      We should use "pool->cpu < 0" to stand for an unbound pool, so we
      convert the code to that.
      
      Keeping "pool->flags & POOL_DISASSOCIATED" here would not be strictly
      wrong, but it is just noise:
        1) we focus on "unbound" here, not "[dis]association".
        2) "pool->cpu < 0" already implies "pool->flags & POOL_DISASSOCIATED".
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      61d0fbb4
  10. 28 May 2014 (1 commit)
  11. 22 May 2014 (2 commits)
    • workqueue: remove the confusing POOL_FREEZING · 74b414ea
      Lai Jiangshan authored
      Currently, the global freezing state is propagated to worker_pools via
      POOL_FREEZING and then to each workqueue; however, the middle step -
      propagation through worker_pools - can be skipped as long as one or
      more max_active adjustments happen for each workqueue after the
      update to the global state is visible.  The global workqueue freezing
      state and the max_active adjustments during workqueue creation and
      [un]freezing are serialized with wq_pool_mutex, so it's trivial to
      guarantee that max_actives stay in sync with global freezing state.
      
      POOL_FREEZING is unnecessary and makes the code more confusing and
      complicates freeze_workqueues_begin() and thaw_workqueues() by
      requiring them to walk through all pools.
      
      Remove POOL_FREEZING and use workqueue_freezing directly instead.
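      
      A hedged sketch of the max_active adjustment consulting the global
      state directly (condensed from pwq_adjust_max_active() as we read it;
      per the description, wq_pool_mutex serializes the global flag):
      
      	bool freezable = wq->flags & WQ_FREEZABLE;
      
      	if (!freezable || !workqueue_freezing)
      		pwq->max_active = wq->saved_max_active;
      	else
      		pwq->max_active = 0;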
      
      tj: Description and comment updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      74b414ea
    • workqueue: rename first_worker() to first_idle_worker() · 1037de36
      Lai Jiangshan authored
      first_worker() actually returns the first idle worker, so the
      self-documenting name first_idle_worker() is better.
      
      All the callers of first_worker() expect it to return an idle worker;
      the "idle" in first_idle_worker() makes that explicit and makes
      reviewers happier.
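      
      A hedged sketch of the renamed helper (believed close to the actual
      body):
      
      	static struct worker *first_idle_worker(struct worker_pool *pool)
      	{
      		if (unlikely(list_empty(&pool->idle_list)))
      			return NULL;
      
      		return list_first_entry(&pool->idle_list, struct worker, entry);
      	}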
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1037de36
  12. 20 May 2014 (10 commits)
    • workqueue: use generic attach/detach routine for rescuers · 51697d39
      Lai Jiangshan authored
      There are several problems with the code that rescuers use to bind
      themselves to the target pool's cpumask.
      
        1) It is very different from how the normal workers bind to cpumask,
           increasing code complexity and maintenance overhead.
      
        2) The code of cpu-binding for rescuers is complicated.
      
        3) If one or more cpu hotplugs happen while a rescuer is processing
           its scheduled work items, the rescuer may not stay bound to the
           cpumask of the pool. This is an allowed behavior, but is still
           hairy. It will be better if the cpumask of the rescuer is always
           kept synchronized with the pool across cpu hotplugs.
      
      Using the generic attach/detach routines solves the above problems
      and results in much simpler code.
      
      tj: Minor description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      51697d39
    • workqueue: separate pool-attaching code out from create_worker() · 4736cbf7
      Lai Jiangshan authored
      Currently, the code to attach a new worker to its pool is embedded in
      create_worker().  Separating this code out will make the code clearer
      and will allow rescuers to share the code path later.
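      
      A hedged sketch of the separated helper (condensed; close to the code
      of that period):
      
      	static void worker_attach_to_pool(struct worker *worker,
      					  struct worker_pool *pool)
      	{
      		mutex_lock(&pool->attach_mutex);
      
      		/* bind the worker to the pool's cpumask */
      		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
      
      		/* stay UNBOUND if the pool is currently disassociated */
      		if (pool->flags & POOL_DISASSOCIATED)
      			worker->flags |= WORKER_UNBOUND;
      
      		list_add_tail(&worker->node, &pool->workers);
      
      		mutex_unlock(&pool->attach_mutex);
      	}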
      
      tj: Description and comment updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      4736cbf7
    • workqueue: rename manager_mutex to attach_mutex · 92f9c5c4
      Lai Jiangshan authored
      manager_mutex is now only used to protect attaching to the pool
      and the pool->workers list.  It protects pool->workers and operations
      based on this list, such as:
      
      	cpu-binding for the workers in pool->workers
      	the operations to set/clear WORKER_UNBOUND
      
      So let's rename manager_mutex to attach_mutex to better reflect its
      role. This patch is a pure rename.
      
      tj: Minor comment and description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      92f9c5c4
    • workqueue: narrow the protection range of manager_mutex · 4d757c5c
      Lai Jiangshan authored
      In create_worker(), as pool->worker_ida now uses
      ida_simple_get()/ida_simple_remove() and doesn't require external
      synchronization, it doesn't need manager_mutex.
      
      The struct worker allocation and the kthread allocation are not
      visible to anyone before the worker is attached, so they don't need
      manager_mutex either.
      
      The above operations happen before the attaching operation which
      attaches the worker to the pool.  Between attaching and starting the
      worker, the worker is already attached to the pool, so cpu hotplug
      will handle cpu-binding for the worker correctly and we don't need
      manager_mutex after attaching.
      
      The conclusion is that only the attaching operation needs manager_mutex,
      so we narrow the protection section of manager_mutex in create_worker().
      
      Some comments about manager_mutex are removed, because we will rename
      it to attach_mutex and add worker_attach_to_pool() later which will be
      self-explanatory.
      
      tj: Minor description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      4d757c5c
    • workqueue: convert worker_idr to worker_ida · 7cda9aae
      Lai Jiangshan authored
      We no longer iterate workers via worker_idr and worker_idr is used
      only for allocating/freeing ID, so we can convert it to worker_ida.
      
      By using ida_simple_get()/ida_simple_remove(), worker_ida doesn't
      require external synchronization, so we don't need manager_mutex to
      protect it and the ID-removal code can be moved out of
      worker_detach_from_pool().
      
      In a later patch, worker_detach_from_pool() will be used by rescuers,
      which don't have IDs, so we move the ID-removal code out of
      worker_detach_from_pool() into worker_thread().
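      
      A hedged sketch of the converted allocation and removal:
      
      	/* in create_worker(); no external locking needed */
      	id = ida_simple_get(&pool->worker_ida, 0, 0, GFP_KERNEL);
      
      	/* on the exit path in worker_thread(), after detaching */
      	ida_simple_remove(&pool->worker_ida, worker->id);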
      
      tj: Minor description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      7cda9aae
    • workqueue: separate iteration role from worker_idr · da028469
      Lai Jiangshan authored
      worker_idr has the iteration (iterating for attached workers) and
      worker ID duties. These two duties don't have to be tied together. We
      can separate them and use a list for tracking attached workers and
      iteration.
      
      Before this separation, it wasn't possible to add rescuer workers to
      worker_idr because rescuer workers can't allocate an ID dynamically:
      ID allocation depends on memory allocation, which rescuers can't
      depend on.
      
      After the separation, we can easily add rescuer workers to the list
      for iteration without any memory allocation.  This is required when we
      attach the rescuer worker to the pool in a later patch.
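      
      A hedged sketch of list-based iteration over attached workers (field
      names follow the code of that period):
      
      	struct worker *worker;
      
      	list_for_each_entry(worker, &pool->workers, node)
      		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);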
      
      tj: Minor description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      da028469
    • workqueue: destroy worker directly in the idle timeout handler · 3347fc9f
      Lai Jiangshan authored
      Since destroy_worker() doesn't need to sleep nor require manager_mutex,
      it can be called directly from the idle timeout handler.  This helps us
      remove POOL_MANAGE_WORKERS and maybe_destroy_worker() and simplify
      manage_workers().
      
      After POOL_MANAGE_WORKERS is removed, worker_thread() no longer needs
      to test whether it needs to manage after processing work items, so we
      can remove that test branch.
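      
      A hedged sketch of the resulting timeout handler (believed close to
      the actual code; timer callbacks took an unsigned long argument at the
      time):
      
      	static void idle_worker_timeout(unsigned long __pool)
      	{
      		struct worker_pool *pool = (void *)__pool;
      
      		spin_lock_irq(&pool->lock);
      
      		while (too_many_workers(pool)) {
      			struct worker *worker;
      			unsigned long expires;
      
      			/* idle_list is kept in LIFO order, check the last one */
      			worker = list_entry(pool->idle_list.prev, struct worker, entry);
      			expires = worker->last_active + IDLE_WORKER_TIMEOUT;
      
      			if (time_before(jiffies, expires)) {
      				mod_timer(&pool->idle_timer, expires);
      				break;
      			}
      
      			destroy_worker(worker);
      		}
      
      		spin_unlock_irq(&pool->lock);
      	}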
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      3347fc9f
    • workqueue: async worker destruction · 60f5a4bc
      Lai Jiangshan authored
      worker destruction includes these parts of code:
      	adjust pool's stats
      	remove the worker from idle list
      	detach the worker from the pool
      	kthread_stop() to wait for the worker's task exit
      	free the worker struct
      
      We can see that there is no essential work to do after
      kthread_stop(), which means destroy_worker() doesn't need to wait for
      the worker's task to exit, so we can remove kthread_stop() and free
      the worker struct in the worker's exit path.
      
      However, put_unbound_pool() still needs to sync with the destruction
      of all of its workers before destroying the pool; otherwise, the
      workers may access the already-freed pool while they are exiting.
      
      So we also move the "detach the worker" code to the exit path and let
      put_unbound_pool() sync with this code via detach_completion.
      
      The "detach the worker" code is wrapped in a new function,
      worker_detach_from_pool().  Although worker_detach_from_pool() is only
      called once (in worker_thread()) after this patch, we wrap it for
      these reasons (a sketch follows the list):
      
        1) The "detach the worker" code is not short enough to unfold
           in worker_thread().
        2) The name worker_detach_from_pool() is self-documenting, and we
           add some comments above the function.
        3) It will be shared by the rescuer in a later patch, which allows
           rescuers and normal workers to use the same attach/detach
           framework.
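      
      A hedged sketch of the new helper (believed close to the code at this
      point in the series, before worker_idr was converted to an ida):
      
      	static void worker_detach_from_pool(struct worker *worker,
      					    struct worker_pool *pool)
      	{
      		struct completion *detach_completion = NULL;
      
      		mutex_lock(&pool->manager_mutex);
      		idr_remove(&pool->worker_idr, worker->id);
      		if (idr_is_empty(&pool->worker_idr))
      			detach_completion = pool->detach_completion;
      		mutex_unlock(&pool->manager_mutex);
      
      		if (detach_completion)
      			complete(detach_completion);
      	}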
      
      The worker ID is freed when detaching, which happens before the worker
      is fully dead, so the dying worker's ID may be re-used for a new
      worker.  To avoid two or more workers having the same name, the dying
      worker's task name is changed to "worker/dying".
      
      Since "detach the worker" is moved out from destroy_worker(),
      destroy_worker() doesn't require manager_mutex, so the
      "lockdep_assert_held(&pool->manager_mutex)" in destroy_worker() is
      removed, and destroy_worker() is not protected by manager_mutex in
      put_unbound_pool().
      
      tj: Minor description updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      60f5a4bc
    • workqueue: destroy_worker() should destroy idle workers only · 73eb7fe7
      Lai Jiangshan authored
      We used to have a CPU-online failure path where a worker was created
      for the CPU coming online and, if the online operation failed, the
      created worker was shut down without ever being started.  But this
      behavior was changed: the first worker for a CPU coming online is now
      created and started at the same time.
      
      This means the code already ensures that destroy_worker() destroys
      only idle workers, and we don't want to allow it to destroy any
      non-idle worker in the future; that would be buggy and extremely hard
      to check for.  We should force destroy_worker() to destroy only idle
      workers explicitly.
      
      Since destroy_worker() destroys only idle workers, this patch does not
      change any functionality.  We just need to update the comments and the
      sanity-check code.
      
      In the sanity-check code, we will refuse to destroy the worker
      if !(worker->flags & WORKER_IDLE).
      
      A worker that has entered idle is already started, so we remove the
      "worker->flags & WORKER_STARTED" check; after this removal,
      WORKER_STARTED is totally unneeded, so we remove WORKER_STARTED too.
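      
      A hedged sketch of the tightened sanity check in destroy_worker():
      
      	if (WARN_ON(worker->current_work) ||
      	    WARN_ON(!list_empty(&worker->scheduled)) ||
      	    WARN_ON(!(worker->flags & WORKER_IDLE)))
      		return;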
      
      In the comments for create_worker(), "Create a new worker which is
      bound..." is changed to "... which is attached..." since we now call
      this behavior attaching.
      
      tj: Minor description / comment updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      73eb7fe7
    • workqueue: use manager lock only to protect worker_idr · 9625ab17
      Lai Jiangshan authored
      worker_idr is tightly bound to the manager and is always/only accessed
      with the manager lock held, so we don't need pool->lock for it.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      9625ab17
  13. 13 May 2014 (1 commit)
  14. 19 April 2014 (2 commits)
    • workqueue: simplify wq_update_unbound_numa() by jumping to use_dfl_pwq if the target cpumask equals wq's · 534a3fbb
      Daeseok Youn authored
      
      wq_update_unbound_numa(), when it's decided that the newly updated
      cpumask equals the default, looks at whether the current pwq is
      already the default one and skips setting pwq to the default one.
      This extra step is unnecessary and we can always jump to use_dfl_pwq
      instead. Simplify the code by removing the conditional.
      This doesn't make any functional difference.
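      
      A hedged sketch of the simplified branch in wq_update_unbound_numa():
      
      	if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) {
      		if (cpumask_equal(cpumask, pwq->pool->attrs->cpumask))
      			goto out_unlock;
      	} else {
      		goto use_dfl_pwq;
      	}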
      Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      534a3fbb
    • workqueue: fix a possible race condition between rescuer and pwq-release · 77668c8b
      Lai Jiangshan authored
      There is a race condition between rescuer_thread() and
      pwq_unbound_release_workfn().
      
      Even after a pwq is scheduled for rescue, the associated work items
      may be consumed by any worker.  If all of them are consumed before the
      rescuer gets to them and the pwq's base ref was put due to attribute
      change, the pwq may be released while still being linked on
      @wq->maydays list making the rescuer dereference already freed pwq
      later.
      
      Make send_mayday() pin the target pwq until the rescuer is done with
      it.
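      
      A hedged sketch of the pinning in send_mayday() (comments condensed):
      
      	if (list_empty(&pwq->mayday_node)) {
      		/* pin the pwq; the rescuer drops this ref when done */
      		get_pwq(pwq);
      		list_add_tail(&pwq->mayday_node, &wq->maydays);
      		wake_up_process(wq->rescuer->task);
      	}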
      
      tj: Updated comment and patch description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org # v3.10+
      77668c8b
  15. 18 April 2014 (2 commits)