1. 13 Mar, 2013 16 commits
    • workqueue: introduce workqueue_attrs · 7a4e344c
      Tejun Heo committed
      Introduce struct workqueue_attrs which carries worker attributes -
      currently the nice level and allowed cpumask along with helper
      routines alloc_workqueue_attrs() and free_workqueue_attrs().
      
      Each worker_pool now carries ->attrs describing the attributes of its
      workers.  All functions dealing with cpumask and nice level of workers
      are updated to follow worker_pool->attrs instead of determining them
      from other characteristics of the worker_pool, and init_workqueues()
      is updated to set worker_pool->attrs appropriately for all standard
      pools.
      
      Note that create_worker() is updated to always perform set_user_nice()
      and use set_cpus_allowed_ptr() combined with manual assertion of
      PF_THREAD_BOUND instead of kthread_bind().  This simplifies handling
      random attributes without affecting the outcome.
      
      This patch doesn't introduce any behavior changes.
      
      v2: Missing cpumask_var_t definition caused build failure on some
          archs.  linux/cpumask.h included.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
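The attribute carrier described in this commit can be pictured with a small userspace sketch. It is not the kernel code: the names `wq_attrs_sketch`, `alloc_attrs_sketch()`, `free_attrs_sketch()` and `pool_sketch` are hypothetical, and a 64-bit mask stands in for `cpumask_var_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct workqueue_attrs. */
struct wq_attrs_sketch {
    int nice;          /* nice level applied to every worker */
    uint64_t cpumask;  /* allowed CPUs, one bit per CPU (assumption) */
};

/* Allocate attrs with nice 0 and all CPUs allowed; NULL on failure. */
struct wq_attrs_sketch *alloc_attrs_sketch(void)
{
    struct wq_attrs_sketch *attrs = calloc(1, sizeof(*attrs));

    if (!attrs)
        return NULL;
    attrs->cpumask = UINT64_MAX;
    return attrs;
}

void free_attrs_sketch(struct wq_attrs_sketch *attrs)
{
    free(attrs);  /* free(NULL) is a no-op */
}

/* A pool carries ->attrs and consults it instead of deriving settings. */
struct pool_sketch {
    struct wq_attrs_sketch *attrs;
};

int pool_allows_cpu(const struct pool_sketch *pool, int cpu)
{
    return cpu >= 0 && cpu < 64 && ((pool->attrs->cpumask >> cpu) & 1);
}
```

The point of the sketch is only that code asking "which CPUs may this worker use" reads the answer from the pool's attrs rather than from other pool characteristics.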
    • workqueue: separate out init_worker_pool() from init_workqueues() · 4e1a1f9a
      Tejun Heo committed
      This will be used to implement unbound pools with custom attributes.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: replace POOL_MANAGING_WORKERS flag with worker_pool->manager_arb · 34a06bd6
      Tejun Heo committed
      POOL_MANAGING_WORKERS is used to synchronize the manager role.
      Synchronizing among workers doesn't need blocking and that's why it's
      implemented as a flag.
      
      It got converted to a mutex a while back to add blocking wait from CPU
      hotplug path - 60373152 ("workqueue: use mutex for global_cwq
      manager exclusion").  Later it turned out that synchronization among
      workers and cpu hotplug need to be done separately.  Eventually,
      POOL_MANAGING_WORKERS is restored and workqueue->manager_mutex got
      morphed into workqueue->assoc_mutex - 552a37e9 ("workqueue: restore
      POOL_MANAGING_WORKERS") and b2eb83d1 ("workqueue: rename
      manager_mutex to assoc_mutex").
      
      Now, we're gonna need to be able to lock out managers from
      destroy_workqueue() to support multiple unbound pools with custom
      attributes making it again necessary to be able to block on the
      manager role.  This patch replaces POOL_MANAGING_WORKERS with
      worker_pool->manager_arb.
      
      This patch doesn't introduce any behavior changes.
      
      v2: s/manager_mutex/manager_arb/
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: update synchronization rules on worker_pool_idr · fa1b54e6
      Tejun Heo committed
      Make worker_pool_idr protected by workqueue_lock for writes and
      sched-RCU protected for reads.  Lockdep assertions are added to
      for_each_pool() and get_work_pool() and all their users are converted
      to either hold workqueue_lock or disable preemption/irq.
      
      worker_pool_assign_id() is updated to hold workqueue_lock when
      allocating a pool ID.  As idr_get_new() always performs RCU-safe
      assignment, this is enough on the writer side.
      
      As standard pools are never destroyed, there's nothing to do on that
      side.
      
      The locking is superfluous at this point.  This is to help
      implementation of unbound pools/pwqs with custom attributes.
      
      This patch doesn't introduce any behavior changes.
      
      v2: Updated for_each_pwq() to use if/else for the hidden assertion
          statement instead of just if as suggested by Lai.  This avoids
          confusing the following else clause.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
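The v2 note about using if/else for the hidden assertion can be illustrated with a toy iterator. This is a sketch in standard C, not the kernel macro: `for_each_item_sketch()` and `check_locked_sketch()` are hypothetical names, and a comma expression is used where kernel code would typically use a GNU statement expression.

```c
#include <assert.h>

static int check_calls_sketch;  /* counts how often the hidden check ran */

/* Stand-in for a lockdep-style assertion evaluated on every iteration. */
static void check_locked_sketch(void)
{
    check_calls_sketch++;
}

/*
 * The hidden check sits in an "if" whose condition is always false, and
 * the loop body becomes its "else" branch.  Because the hidden "if"
 * already owns an "else", a later "else" written by the caller cannot
 * bind to it by accident.
 */
#define for_each_item_sketch(i, n)                  \
    for ((i) = 0; (i) < (n); (i)++)                 \
        if (check_locked_sketch(), 0) { }           \
        else

int visit_or_fallback_sketch(int enabled, int n)
{
    int i, visited = 0;

    if (enabled)
        for_each_item_sketch(i, n)
            visited++;
    else
        visited = -1;  /* binds to "if (enabled)", not the hidden if */

    return visited;
}
```

Had the macro ended in a bare `if (...)`, the caller's `else` above would have attached to the hidden check inside the loop instead of to `if (enabled)`. Compilers may still suggest braces here; the sketch leaves them out on purpose to show the binding.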
    • workqueue: update synchronization rules on workqueue->pwqs · 76af4d93
      Tejun Heo committed
      Make workqueue->pwqs protected by workqueue_lock for writes and
      sched-RCU protected for reads.  Lockdep assertions are added to
      for_each_pwq() and first_pwq() and all their users are converted to
      either hold workqueue_lock or disable preemption/irq.
      
      alloc_and_link_pwqs() is updated to use list_add_tail_rcu() for
      consistency which isn't strictly necessary as the workqueue isn't
      visible.  destroy_workqueue() isn't updated to sched-RCU release pwqs.
      This is okay as the workqueue should have no users left by that point.
      
      The locking is superfluous at this point.  This is to help
      implementation of unbound pools/pwqs with custom attributes.
      
      This patch doesn't introduce any behavior changes.
      
      v2: Updated for_each_pwq() to use if/else for the hidden assertion
          statement instead of just if as suggested by Lai.  This avoids
          confusing the following else clause.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: replace get_pwq() with explicit per_cpu_ptr() accesses and first_pwq() · 7fb98ea7
      Tejun Heo committed
      get_pwq() takes @cpu, which can also be WORK_CPU_UNBOUND, and @wq and
      returns the matching pwq (pool_workqueue).  We want to move away from
      using @cpu for identifying pools and pwqs for unbound pools with
      custom attributes and there is only one user - workqueue_congested() -
      which makes use of the WQ_UNBOUND conditional in get_pwq().  All other
      users already know whether they're dealing with a per-cpu or unbound
      workqueue.
      
      Replace get_pwq() with explicit per_cpu_ptr(wq->cpu_pwqs, cpu) for
      per-cpu workqueues and first_pwq() for unbound ones, and open-code
      WQ_UNBOUND conditional in workqueue_congested().
      
      Note that this makes workqueue_congested() behave slightly differently
      when @cpu other than WORK_CPU_UNBOUND is specified.  It ignores @cpu
      for unbound workqueues and always uses the first pwq instead of
      oopsing.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: remove workqueue_struct->pool_wq.single · 420c0ddb
      Tejun Heo committed
      workqueue->pool_wq union is used to point either to percpu pwqs
      (pool_workqueues) or single unbound pwq.  As the first pwq can be
      accessed via workqueue->pwqs list, there's no reason for the single
      pointer anymore.
      
      Use list_first_entry(workqueue->pwqs) to access the unbound pwq and
      drop workqueue->pool_wq.single pointer and the pool_wq union.  It
      simplifies the code and eases implementing multiple unbound pools w/
      custom attributes.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: consistently use int for @cpu variables · d84ff051
      Tejun Heo committed
      Workqueue is mixing unsigned int and int for @cpu variables.  There's
      no point in using unsigned int for cpus - many of cpu related APIs
      take int anyway.  Consistently use int for @cpu variables so that we
      can use negative values to mark special ones.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: add workqueue_struct->maydays list to replace mayday cpu iterators · 493a1724
      Tejun Heo committed
      Similar to how pool_workqueue iteration used to be, raising and
      servicing mayday requests is based on CPU numbers.  It's hairy because
      cpumask_t may not be able to handle WORK_CPU_UNBOUND and cpumasks are
      assumed to be always set on UP.  This is ugly and can't handle
      multiple unbound pools to be added for unbound workqueues w/ custom
      attributes.
      
      Add workqueue_struct->maydays.  When a pool_workqueue needs rescuing,
      it gets chained on the list through pool_workqueue->mayday_node and
      rescuer_thread() consumes the list until it's empty.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
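The list-based mayday flow can be sketched in userspace as follows. All names are hypothetical, a hand-rolled singly linked queue stands in for the kernel's list_head, and an `on_list` flag stands in for checking whether the node is already linked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool_workqueue that may need rescuing. */
struct pwq_m_sketch {
    int rescued;                       /* times the rescuer served it */
    int on_list;                       /* already chained on maydays? */
    struct pwq_m_sketch *mayday_next;  /* plays the role of mayday_node */
};

/* Hypothetical stand-in for the workqueue owning the maydays list. */
struct wq_m_sketch {
    struct pwq_m_sketch *head;
    struct pwq_m_sketch **tail;
};

void wq_init_sketch(struct wq_m_sketch *wq)
{
    wq->head = NULL;
    wq->tail = &wq->head;
}

/* Chain a pwq on the list unless it is already waiting for rescue. */
void send_mayday_sketch(struct wq_m_sketch *wq, struct pwq_m_sketch *pwq)
{
    if (pwq->on_list)
        return;
    pwq->on_list = 1;
    pwq->mayday_next = NULL;
    *wq->tail = pwq;
    wq->tail = &pwq->mayday_next;
}

/* Consume the list until it is empty; returns how many pwqs were served. */
int rescue_all_sketch(struct wq_m_sketch *wq)
{
    int served = 0;

    while (wq->head) {
        struct pwq_m_sketch *pwq = wq->head;

        wq->head = pwq->mayday_next;
        if (!wq->head)
            wq->tail = &wq->head;
        pwq->on_list = 0;
        pwq->rescued++;
        served++;
    }
    return served;
}
```

Nothing in this flow refers to a CPU number, which is the property the commit is after: any number of pwqs, bound or unbound, can ask for rescue.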
    • workqueue: restructure pool / pool_workqueue iterations in freeze/thaw functions · 24b8a847
      Tejun Heo committed
      The three freeze/thaw related functions - freeze_workqueues_begin(),
      freeze_workqueues_busy() and thaw_workqueues() - need to iterate
      through all pool_workqueues of all freezable workqueues.  They did it
      by first iterating pools and then visiting all pwqs (pool_workqueues)
      of all workqueues, processing each pwq whose pwq->pool matches the
      current pool.  This is rather backwards and was done this way partly
      because workqueue didn't have fitting iteration helpers and partly to
      reduce the number of lock operations on pool->lock.
      
      Workqueue now has fitting iterators and the locking operation overhead
      isn't anything to worry about - those locks are unlikely to be
      contended and the same CPU visiting the same set of locks multiple
      times isn't expensive.
      
      Restructure the three functions such that the flow better matches the
      logical steps and pwq iteration is done using for_each_pwq() inside
      workqueue iteration.
      
      * freeze_workqueues_begin(): Setting of FREEZING is moved into a
        separate for_each_pool() iteration.  pwq iteration for clearing
        max_active is updated as described above.
      
      * freeze_workqueues_busy(): pwq iteration updated as described above.
      
      * thaw_workqueues(): The single for_each_wq_cpu() iteration is broken
        into three discrete steps - clearing FREEZING, restoring max_active,
        and kicking workers.  The first and last steps use for_each_pool()
        and the second step uses pwq iteration described above.
      
      This makes the code easier to understand and removes the use of
      for_each_wq_cpu() for walking pwqs, which can't support multiple
      unbound pwqs which will be needed to implement unbound workqueues with
      custom attributes.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: introduce for_each_pool() · 17116969
      Tejun Heo committed
      With the scheduled unbound pools with custom attributes, there will be
      multiple unbound pools, so we won't be able to use
      for_each_wq_cpu() + for_each_std_worker_pool() to iterate through all
      pools.
      
      Introduce for_each_pool() which iterates through all pools using
      worker_pool_idr and use it instead of for_each_wq_cpu() +
      for_each_std_worker_pool() combination in freeze_workqueues_begin().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: replace for_each_pwq_cpu() with for_each_pwq() · 49e3cf44
      Tejun Heo committed
      Introduce for_each_pwq() which iterates all pool_workqueues of a
      workqueue using the recently added workqueue->pwqs list and replace
      for_each_pwq_cpu() usages with it.
      
      This is primarily to remove the single unbound CPU assumption from pwq
      iteration for the scheduled unbound pools with custom attributes
      support which would introduce multiple unbound pwqs per workqueue;
      however, it also simplifies iterator users.
      
      Note that pwq->pool initialization is moved to alloc_and_link_pwqs()
      as that now is the only place which is explicitly handling the two pwq
      types.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: add workqueue_struct->pwqs list · 30cdf249
      Tejun Heo committed
      Add workqueue_struct->pwqs list and chain all pool_workqueues
      belonging to a workqueue there.  This will be used to implement
      generic pool_workqueue iteration and handle multiple pool_workqueues
      for the scheduled unbound pools with custom attributes.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: introduce kmem_cache for pool_workqueues · e904e6c2
      Tejun Heo committed
      pool_workqueues need to be aligned to 1 << WORK_STRUCT_FLAG_BITS as
      the lower bits of work->data are used for flags when they're pointing
      to pool_workqueues.
      
      Due to historical reasons, unbound pool_workqueues are allocated using
      kzalloc() with sufficient buffer area for alignment and aligned
      manually.  The original pointer is stored at the end which free_pwqs()
      retrieves when freeing it.
      
      There's no reason for this hackery anymore.  Set alignment of struct
      pool_workqueue to 1 << WORK_STRUCT_FLAG_BITS, add kmem_cache for
      pool_workqueues with proper alignment and replace the hacky alloc and
      free implementation with plain kmem_cache_zalloc/free().
      
      In case WORK_STRUCT_FLAG_BITS gets shrunk too much and makes fields of
      pool_workqueues misaligned, trigger WARN if the alignment of struct
      pool_workqueue becomes smaller than that of long long.
      
      Note that assertion on IS_ALIGNED() is removed from alloc_pwqs().  We
      already have another one in pwq init loop in __alloc_workqueue_key().
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
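Why the alignment matters can be shown with a short userspace sketch: if every object is aligned to 1 << FLAG_BITS, the low bits of its address are zero and can carry flags. The value of `FLAG_BITS_SKETCH` and all names below are assumptions for illustration, and C11 `aligned_alloc()` stands in for an aligned kmem_cache.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FLAG_BITS_SKETCH 8  /* assumed value, for illustration only */
#define FLAG_MASK_SKETCH ((uintptr_t)(1u << FLAG_BITS_SKETCH) - 1)

struct pwq_sketch {
    int refcnt;
};

/* Allocate a pwq whose address has its low FLAG_BITS_SKETCH bits clear. */
struct pwq_sketch *alloc_pwq_sketch(void)
{
    /* aligned_alloc() wants the size to be a multiple of the alignment */
    return aligned_alloc(1u << FLAG_BITS_SKETCH, 1u << FLAG_BITS_SKETCH);
}

/* Store pointer and flags in one word, as work->data does. */
uintptr_t pack_data_sketch(struct pwq_sketch *pwq, unsigned flags)
{
    return (uintptr_t)pwq | (flags & FLAG_MASK_SKETCH);
}

struct pwq_sketch *data_to_pwq_sketch(uintptr_t data)
{
    return (struct pwq_sketch *)(data & ~FLAG_MASK_SKETCH);
}

unsigned data_to_flags_sketch(uintptr_t data)
{
    return (unsigned)(data & FLAG_MASK_SKETCH);
}
```

With the allocator guaranteeing the alignment, there is no need to over-allocate, align by hand, and stash the original pointer for the free path.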
    • workqueue: make workqueue_lock irq-safe · e98d5b16
      Tejun Heo committed
      workqueue_lock will be used to synchronize areas which require
      irq-safety and there isn't much benefit in keeping it not irq-safe.
      Make it irq-safe.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: make sanity checks less punishing using WARN_ON[_ONCE]()s · 6183c009
      Tejun Heo committed
      Workqueue has been using mostly BUG_ON()s for sanity checks, which
      fail unnecessarily harshly when the assertion doesn't hold.  Most
      assertions can be converted to be less drastic such that things can limp
      along instead of dying completely.  Convert BUG_ON()s to
      WARN_ON[_ONCE]()s with softer failure behaviors - e.g. if assertion
      check fails in destroy_worker(), trigger WARN and silently ignore
      destruction request.
      
      Most conversions are trivial.  Note that sanity checks in
      destroy_workqueue() are moved above removal from workqueues list so
      that it can bail out without side-effects if assertion checks fail.
      
      This patch doesn't introduce any visible behavior changes during
      normal operation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
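The "warn and bail out" pattern from the destroy_worker() example can be sketched in userspace like this. `warn_on_sketch()` is a hypothetical stand-in for WARN_ON(): it reports the problem and returns the condition so the caller can return early instead of crashing.

```c
#include <assert.h>
#include <stdio.h>

static int warn_count_sketch;  /* how many warnings have fired */

/* Report a failed sanity check and hand the condition back to the caller. */
static int warn_on_sketch(int cond, const char *what)
{
    if (cond) {
        warn_count_sketch++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

struct worker_sketch {
    int busy;
    int destroyed;
};

/* Softer failure: a bad request is logged and ignored, not fatal. */
void destroy_worker_sketch(struct worker_sketch *worker)
{
    if (warn_on_sketch(worker->busy, "destroying a busy worker"))
        return;
    worker->destroyed = 1;
}
```

A BUG_ON()-style check would stop everything at the first violation; here the system keeps limping along and the warning remains as evidence.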
  2. 05 Mar, 2013 3 commits
  3. 28 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for-each-entry iterators were conceived
      differently from the list ones, which look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
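To see why the separate node cursor is unnecessary, here is a userspace sketch of an hlist-style iterator that needs only the entry pointer. It is not the kernel macro: all names are hypothetical, and the entry type is passed explicitly so the sketch stays in standard C instead of relying on typeof.

```c
#include <assert.h>
#include <stddef.h>

struct hnode_sketch {
    struct hnode_sketch *next;
};

struct hhead_sketch {
    struct hnode_sketch *first;
};

/* Map a node pointer back to the structure that embeds it. */
#define entry_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Same, but yields NULL for a NULL node so the loop can terminate. */
#define entry_safe_sketch(ptr, type, member) \
    ((ptr) ? entry_of_sketch(ptr, type, member) : NULL)

/* Only the entry cursor "pos" is needed; the node is reached via pos. */
#define for_each_entry_sketch(pos, head, type, member)                  \
    for ((pos) = entry_safe_sketch((head)->first, type, member);        \
         (pos);                                                         \
         (pos) = entry_safe_sketch((pos)->member.next, type, member))

struct item_sketch {
    int value;
    struct hnode_sketch node;
};

int sum_items_sketch(struct hhead_sketch *head)
{
    struct item_sketch *it;
    int sum = 0;

    for_each_entry_sketch(it, head, struct item_sketch, node)
        sum += it->value;
    return sum;
}
```

Since the current node is always reachable as `pos->member`, the few callers that used the old node cursor can switch to that expression, which matches the manual fix-up listed in the commit message.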
  4. 20 Feb, 2013 1 commit
  5. 14 Feb, 2013 3 commits
    • workqueue: rename cpu_workqueue to pool_workqueue · 112202d9
      Tejun Heo committed
      workqueue has moved away from global_cwqs to worker_pools and with the
      scheduled custom worker pools, workqueues will be associated with
      pools which don't have anything to do with CPUs.  The workqueue code
      went through significant amount of changes recently and mass renaming
      isn't likely to hurt much additionally.  Let's replace 'cpu' with
      'pool' so that it reflects the current design.
      
      * s/struct cpu_workqueue_struct/struct pool_workqueue/
      * s/cpu_wq/pool_wq/
      * s/cwq/pwq/
      
      This patch is purely cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: reimplement is_chained_work() using current_wq_worker() · 8d03ecfe
      Tejun Heo committed
      is_chained_work() was added before current_wq_worker() and implemented
      its own ham-fisted way of finding out whether %current is a workqueue
      worker - it iterates through all possible workers.
      
      Drop the custom implementation and reimplement using
      current_wq_worker().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: fix is_chained_work() regression · 1dd63814
      Tejun Heo committed
      c9e7cf27 ("workqueue: move busy_hash from global_cwq to
      worker_pool") incorrectly converted is_chained_work() to use
      get_gcwq() inside for_each_gcwq_cpu() while removing get_gcwq().
      
      As cwq might not exist for all possible workqueue CPUs, @cwq can be
      NULL and the following cwq dereferences can lead to oops.
      
      Fix it by using for_each_cwq_cpu() instead, which is the better one to
      use anyway as we only need to check pools that the wq is associated
      with.
      Signed-off-by: Tejun Heo <tj@kernel.org>
  6. 08 Feb, 2013 3 commits
    • workqueue: pick cwq instead of pool in __queue_work() · 8594fade
      Lai Jiangshan committed
      Currently, __queue_work() chooses the pool to queue a work item to and
      then determines cwq from the target wq and the chosen pool.  This is a
      bit backwards in that we can determine cwq first and simply use
      cwq->pool.  This way, we can skip get_std_worker_pool() in queueing
      path which will be a hurdle when implementing custom worker pools.
      
      Update __queue_work() such that it chooses the target cwq and then use
      cwq->pool instead of the other way around.  While at it, add missing
      {} in an if statement.
      
      This patch doesn't introduce any functional changes.
      
      tj: The original patch had two get_cwq() calls - the first to
          determine the pool by doing get_cwq(cpu, wq)->pool and the second
          to determine the matching cwq from get_cwq(pool->cpu, wq).
          Updated the function such that it chooses cwq instead of pool and
          removed the second call.  Rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: make get_work_pool_id() cheaper · 54d5b7d0
      Lai Jiangshan committed
      get_work_pool_id() currently first obtains pool using get_work_pool()
      and then returns pool->id.  For an off-queue work item, this involves
      obtaining pool ID from work->data, performing idr_find() to find the
      matching pool and then returning its pool->id which of course is the
      same as the one which went into idr_find().
      
      Just open code WORK_STRUCT_CWQ case and directly return pool ID from
      work->data.
      
      tj: The original patch dropped on-queue work item handling and renamed
          the function to offq_work_pool_id().  There isn't much benefit in
          doing so.  Handling it only requires a single if() and we need at
          least BUG_ON(), which is also a branch, even if we drop on-queue
          handling.  Open code WORK_STRUCT_CWQ case and keep the function in
          line with get_work_pool().  Rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
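The shortcut can be sketched as follows, under an assumed encoding: one low flag bit says whether the data word points to a cwq, and otherwise the pool ID is stored shifted above that bit. The flag, the shift and every name below are hypothetical, not the kernel's actual layout.

```c
#include <assert.h>
#include <stdint.h>

#define CWQ_FLAG_ID_SKETCH   ((uintptr_t)1)  /* assumed "points to cwq" bit */
#define POOL_SHIFT_ID_SKETCH 1               /* assumed position of pool ID */

struct pool_id_sketch {
    int id;
};

struct cwq_id_sketch {
    struct pool_id_sketch *pool;
};

/*
 * On-queue: follow cwq->pool->id.  Off-queue: the ID is already in the
 * data word, so it is returned directly with no lookup and no round trip
 * through a pool pointer.
 */
int get_work_pool_id_sketch(uintptr_t data)
{
    if (data & CWQ_FLAG_ID_SKETCH) {
        const struct cwq_id_sketch *cwq =
            (const struct cwq_id_sketch *)(data & ~CWQ_FLAG_ID_SKETCH);

        return cwq->pool->id;
    }
    return (int)(data >> POOL_SHIFT_ID_SKETCH);
}
```

The single branch on the flag bit is the "single if()" the tj note refers to when it argues for keeping on-queue handling in the function.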
    • workqueue: move nr_running into worker_pool · e19e397a
      Tejun Heo committed
      As nr_running is likely to be accessed from other CPUs during
      try_to_wake_up(), it was kept outside worker_pool; however, while less
      frequent, other fields in worker_pool are accessed from other CPUs
      for, e.g., non-reentrancy check.  Also, with recent pool related
      changes, accessing nr_running matching the worker_pool isn't as simple
      as it used to be.
      
      Move nr_running inside worker_pool.  Keep it aligned to cacheline and
      define CPU pools using DEFINE_PER_CPU_SHARED_ALIGNED().  This should
      give at least the same cacheline behavior.
      
      get_pool_nr_running() is replaced with direct pool->nr_running
      accesses.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
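Keeping a hot counter on its own cacheline inside a larger structure can be sketched with C11 alignment specifiers. The 64-byte line size and all names are assumptions, and a plain `int` stands in for the kernel's atomic counter.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_SKETCH 64  /* assumed cacheline size */

struct pool_nr_sketch {
    int id;
    int nr_workers;
    /* touched from other CPUs, so it starts a fresh cacheline */
    alignas(CACHELINE_SKETCH) int nr_running;
};

/* Compile-time check that the counter really is cacheline aligned. */
static_assert(offsetof(struct pool_nr_sketch, nr_running)
              % CACHELINE_SKETCH == 0,
              "nr_running must start on a cacheline boundary");

void worker_started_sketch(struct pool_nr_sketch *pool)
{
    pool->nr_running++;  /* direct field access, no lookup helper */
}
```

Aligning the member also raises the alignment of the whole structure, which is why the per-CPU pool definitions need a matching aligned definition as the commit describes.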
  7. 07 Feb, 2013 6 commits
    • workqueue: cosmetic update in try_to_grab_pending() · 16062836
      Tejun Heo committed
      With the recent is-work-queued-here test simplification, the nested
      if() in try_to_grab_pending() can be collapsed.  Collapse it.
      
      This patch is purely cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: simplify is-work-item-queued-here test · 0b3dae68
      Lai Jiangshan committed
      Currently, determining whether a work item is queued on a locked pool
      involves somewhat convoluted memory barrier dancing.  It goes like the
      following.
      
      * When a work item is queued on a pool, work->data is updated before
        work->entry is linked to the pending list with a wmb() in between.
      
      * When trying to determine whether a work item is currently queued on
        a pool pointed to by work->data, it locks the pool and looks at
        work->entry.  If work->entry is linked, we then do rmb() and then
        check whether work->data points to the current pool.
      
      This works because work->data can only point to a pool if it
      currently is or was on the pool and,
      
      * If it currently is on the pool, the tests would obviously succeed.
      
      * If it left the pool, its work->entry was cleared under pool->lock,
        so if we're seeing non-empty work->entry, it has to be from the work
        item being linked on another pool.  Because work->data is updated
        before work->entry is linked with wmb() in between, work->data update
        from another pool is guaranteed to be visible if we do rmb() after
        seeing non-empty work->entry.  So, we either see empty work->entry
        or we see updated work->data pointing to another pool.
      
      While this works, it's convoluted, to put it mildly.  With recent
      updates, it's now guaranteed that work->data points to cwq only while
      the work item is queued and that updating work->data to point to cwq
      or back to pool is done under pool->lock, so we can simply test
      whether work->data points to cwq which is associated with the
      currently locked pool instead of the convoluted memory barrier
      dancing.
      
      This patch replaces the memory barrier based "are you still here,
      really?" test with much simpler "does work->data point to me?" test -
      if work->data points to a cwq which is associated with the currently
      locked pool, the work item is guaranteed to be queued on the pool as
      work->data can start and stop pointing to such cwq only under
      pool->lock and the start and stop coincide with queue and dequeue.
      
      tj: Rewrote the comments and description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
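The simplified test can be sketched in userspace under an assumed encoding of the data word: a low flag bit marks "points to cwq", and an off-queue item stores the pool ID shifted above it. Names, flag and shift are hypothetical, and the pool lock the real code holds is left out.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_CWQ_FLAG_SKETCH ((uintptr_t)1)  /* assumed flag bit */

struct pool_q_sketch {
    int id;
};

struct cwq_q_sketch {
    struct pool_q_sketch *pool;
};

struct work_q_sketch {
    uintptr_t data;
};

/* Queueing makes data point to the cwq (done under the pool lock). */
void mark_queued_sketch(struct work_q_sketch *work, struct cwq_q_sketch *cwq)
{
    work->data = (uintptr_t)cwq | DATA_CWQ_FLAG_SKETCH;
}

/* Dequeueing makes data record only the pool ID again. */
void mark_dequeued_sketch(struct work_q_sketch *work,
                          const struct pool_q_sketch *pool)
{
    work->data = (uintptr_t)pool->id << 1;
}

/* "Does work->data point to me?" with no memory barriers involved. */
int queued_on_pool_sketch(const struct work_q_sketch *work,
                          const struct pool_q_sketch *locked_pool)
{
    const struct cwq_q_sketch *cwq;

    if (!(work->data & DATA_CWQ_FLAG_SKETCH))
        return 0;
    cwq = (const struct cwq_q_sketch *)(work->data & ~DATA_CWQ_FLAG_SKETCH);
    return cwq->pool == locked_pool;
}
```

The check is sound only because both transitions of the data word happen under the same lock the caller already holds, which is the invariant the commit message spells out.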
    • workqueue: make work->data point to pool after try_to_grab_pending() · 4468a00f
      Lai Jiangshan committed
      We plan to use work->data pointing to cwq as the synchronization
      invariant when determining whether a given work item is on a locked
      pool or not, which requires work->data pointing to cwq only while the
      work item is queued on the associated pool.
      
      With delayed_work updated not to overload work->data for target
      workqueue recording, the only case where we still have off-queue
      work->data pointing to cwq is try_to_grab_pending() which doesn't
      update work->data after stealing a queued work item.  There's no
      reason for try_to_grab_pending() to not update work->data to point to
      the pool instead of cwq, like the normal execution does.
      
      This patch adds set_work_pool_and_keep_pending() which makes
      work->data point to pool instead of cwq but keeps the pending bit
      unlike set_work_pool_and_clear_pending() (surprise!).
      
      After this patch, it's guaranteed that only queued work items point to
      cwqs.
      
      This patch doesn't introduce any visible behavior change.
      
      tj: Renamed the new helper function to match
          set_work_pool_and_clear_pending() and rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: add delayed_work->wq to simplify reentrancy handling · 60c057bc
      Lai Jiangshan committed
      To avoid executing the same work item from multiple CPUs concurrently,
      a work_struct records the last pool it was on in its ->data so that,
      on the next queueing, the pool can be queried to determine whether the
      work item is still executing or not.
      
      A delayed_work goes through timer before actually being queued on the
      target workqueue and the timer needs to know the target workqueue and
      CPU.  This is currently achieved by modifying delayed_work->work.data
      such that it points to the cwq which points to the target workqueue
      and the last CPU the work item was on.  __queue_delayed_work()
      extracts the last CPU from delayed_work->work.data and then combines
      it with the target workqueue to create new work.data.
      
      The only thing this rather ugly hack achieves is encoding the target
      workqueue into delayed_work->work.data without using a separate field,
      which could be a trade off one can make; unfortunately, this entangles
      work->data management between regular workqueue and delayed_work code
      by setting cwq pointer before the work item is actually queued and
      becomes a hindrance for further improvements of work->data handling.
      
      This can be easily made sane by adding a target workqueue field to
      delayed_work.  While delayed_work is used widely in the kernel and
      this does make it a bit larger (<5%), I think this is the right
      trade-off especially given the prospect of much saner handling of
      work->data which currently involves quite tricky memory barrier
      dancing, and don't expect to see any measurable effect.
      
      Add delayed_work->wq and drop the delayed_work->work.data overloading.
      
      tj: Rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      60c057bc
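The idea in the commit above can be modelled in user space. This is a minimal sketch, not the kernel code: `struct workqueue`, `queue_delayed()` and `delayed_timer_fn_model()` are hypothetical stand-ins invented for illustration, and only the `wq` field on `delayed_work` comes from the commit message. It shows the target workqueue being remembered in its own field so that `work.data` is left alone until the item is actually queued.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct workqueue {
    const char *name;
    int queued;              /* number of items queued so far */
};

struct work_struct {
    unsigned long data;      /* opaque state owned by the regular queueing code */
};

struct delayed_work {
    struct work_struct work;
    struct workqueue *wq;    /* target workqueue, recorded when the timer is armed */
    int timer_armed;
};

/* Hypothetical helper: arm the "timer" and remember where to queue later. */
static void queue_delayed(struct workqueue *wq, struct delayed_work *dwork)
{
    dwork->wq = wq;
    dwork->timer_armed = 1;
    /* work.data is deliberately not touched here */
}

/* Hypothetical timer callback: queue on the remembered workqueue. */
static struct workqueue *delayed_timer_fn_model(struct delayed_work *dwork)
{
    dwork->timer_armed = 0;
    dwork->wq->queued++;
    return dwork->wq;
}
```

With the separate field, the delayed path never has to encode anything into `work.data`, which is the entanglement the commit message describes removing.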
    • L
      workqueue: make work_busy() test WORK_STRUCT_PENDING first · 038366c5
      Lai Jiangshan committed
      Currently, work_busy() first tests whether the work has a pool
      associated with it and if not, considers it idle.  This works fine
      even for delayed_work.work queued on timer, as __queue_delayed_work()
      sets cwq on delayed_work.work - a queued delayed_work always has its
      cwq and thus pool associated with it.
      
      However, we're about to update delayed_work queueing and this won't
      hold.  Update work_busy() such that it tests WORK_STRUCT_PENDING
      before the associated pool.  This doesn't make any noticeable behavior
      difference now.
      
      With work_pending() test moved, the function reads a lot better with
      "if (!pool)" test flipped to positive.  Flip it.
      
      While at it, lose the comment about now non-existent reentrant
      workqueues.
      
      tj: Reorganized the function and rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      038366c5
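The reordering described above can be sketched as a small user-space model. This is an illustration under assumptions, not the kernel function: `struct pool_model`, `struct work_model` and `work_busy_model()` are made-up names, and the two flag values simply mirror the kernel's WORK_BUSY_PENDING and WORK_BUSY_RUNNING bits. The point is that the pending test comes first and does not depend on a pool being associated with the item.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag bits mirroring the kernel's WORK_BUSY_* values. */
enum {
    WORK_BUSY_PENDING = 1 << 0,
    WORK_BUSY_RUNNING = 1 << 1,
};

/* Hypothetical model types, for illustration only. */
struct pool_model {
    const void *running;     /* item the pool is currently executing, if any */
};

struct work_model {
    bool pending;            /* stands in for WORK_STRUCT_PENDING */
    struct pool_model *pool; /* last pool, or NULL if none is associated */
};

static unsigned int work_busy_model(const struct work_model *work)
{
    unsigned int ret = 0;

    /* Pending is tested first, with or without an associated pool. */
    if (work->pending)
        ret |= WORK_BUSY_PENDING;

    /* Positive "if (pool)" test: only then can the item be running. */
    if (work->pool && work->pool->running == work)
        ret |= WORK_BUSY_RUNNING;

    return ret;
}
```

An item that is pending on a timer but has no pool yet is reported as pending rather than idle, which is the case the commit prepares for.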
    • L
      workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END · 6be19588
      Lai Jiangshan committed
      Now that workqueue has moved away from gcwqs, workqueue no longer has
      the need to have a CPU identifier indicating "no cpu associated" - we
      now use WORK_OFFQ_POOL_NONE instead - and most uses of WORK_CPU_NONE
      are gone.
      
      The only remaining usage is as the end marker for for_each_*wq*()
      iterators, where the name WORK_CPU_NONE is confusing w/o actual
      WORK_CPU_NONE usages.  Similarly, WORK_CPU_LAST which equals
      WORK_CPU_NONE no longer makes sense.
      
      Replace both WORK_CPU_NONE and LAST with WORK_CPU_END.  This patch
      doesn't introduce any functional difference.
      
      tj: s/WORK_CPU_LAST/WORK_CPU_END/ and rewrote the description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      6be19588
  8. 25 Jan, 2013 7 commits
    • T
      workqueue: post global_cwq removal cleanups · 706026c2
      Tejun Heo committed
      Remove remaining references to gcwq.
      
      * __next_gcwq_cpu() steals __next_wq_cpu() name.  The original
        __next_wq_cpu() became __next_cwq_cpu().
      
      * s/for_each_gcwq_cpu/for_each_wq_cpu/
        s/for_each_online_gcwq_cpu/for_each_online_wq_cpu/
      
      * s/gcwq_mayday_timeout/pool_mayday_timeout/
      
      * s/gcwq_unbind_fn/wq_unbind_fn/
      
      * Drop references to gcwq in comments.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      706026c2
    • T
      workqueue: rename nr_running variables · e6e380ed
      Tejun Heo committed
      Rename per-cpu and unbound nr_running variables such that they match
      the pool variables.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      e6e380ed
    • T
      workqueue: remove global_cwq · a60dc39c
      Tejun Heo committed
      global_cwq is now nothing but a container for per-cpu standard
      worker_pools.  Declare the worker pools directly as
      cpu/unbound_std_worker_pools[] and remove global_cwq.
      
      * ____cacheline_aligned_in_smp moved from global_cwq to worker_pool.
        This probably would have made sense even before this change as we
        want each pool to be aligned.
      
      * get_gcwq() is replaced with std_worker_pools() which returns the
        pointer to the standard pool array for a given CPU.
      
      * __alloc_workqueue_key() updated to use get_std_worker_pool() instead
        of open-coding pool determination.
      
      This is part of an effort to remove global_cwq and make worker_pool
      the top level abstraction, which in turn will help implementing worker
      pools with user-specified attributes.
      
      v2: Joonsoo pointed out that it'd be better to align struct worker_pool
          rather than the array so that every pool is aligned.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      a60dc39c
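The v2 note above, aligning the struct rather than the array, can be illustrated in plain C11. This is a sketch under assumptions: the 64-byte `CACHELINE` value and the `pool_plain` / `pool_aligned` types are invented for the example, and `alignas` stands in for the kernel's ____cacheline_aligned_in_smp annotation. Aligning only the array puts element 0 on a boundary, while aligning the type pads every element to a full line.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64   /* assumed cache-line size, for illustration only */

/* No alignment on the type itself. */
struct pool_plain {
    int id;
    int nr_workers;
};

/* Aligning a member raises the alignment of the whole type, so sizeof()
 * is padded up to a multiple of CACHELINE. */
struct pool_aligned {
    alignas(CACHELINE) int id;
    int nr_workers;
};

/* Array-level alignment: only the start of the array is on a boundary. */
alignas(CACHELINE) static struct pool_plain plain_pools[2];

/* Type-level alignment: each element starts on its own boundary. */
static struct pool_aligned aligned_pools[2];

/* Distance in bytes between consecutive elements of each array. */
static size_t plain_stride(void)
{
    return (size_t)((char *)&plain_pools[1] - (char *)&plain_pools[0]);
}

static size_t aligned_stride(void)
{
    return (size_t)((char *)&aligned_pools[1] - (char *)&aligned_pools[0]);
}

/* True if the given address sits on a CACHELINE boundary. */
static int on_boundary(const void *p)
{
    return (uintptr_t)p % CACHELINE == 0;
}
```

In the array-aligned case the second pool shares a cache line with the first; in the type-aligned case it does not.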
    • T
      workqueue: remove worker_pool->gcwq · 4e8f0a60
      Tejun Heo committed
      The only remaining user of pool->gcwq is std_worker_pool_pri().
      Reimplement it using get_gcwq() and remove worker_pool->gcwq.
      
      This is part of an effort to remove global_cwq and make worker_pool
      the top level abstraction, which in turn will help implementing worker
      pools with user-specified attributes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      4e8f0a60
    • T
      workqueue: replace for_each_worker_pool() with for_each_std_worker_pool() · 38db41d9
      Tejun Heo committed
      for_each_std_worker_pool() takes @cpu instead of @gcwq.
      
      This is part of an effort to remove global_cwq and make worker_pool
      the top level abstraction, which in turn will help implementing worker
      pools with user-specified attributes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      38db41d9
    • T
      workqueue: make freezing/thawing per-pool · a1056305
      Tejun Heo committed
      Instead of holding locks from both pools and then processing the pools
      together, make freezing/thawing per-pool - grab locks of one pool,
      process it, release it and then proceed to the next pool.
      
      While this patch changes processing order across pools, order within
      each pool remains the same.  As each pool is independent, this
      shouldn't break anything.
      
      This is part of an effort to remove global_cwq and make worker_pool
      the top level abstraction, which in turn will help implementing worker
      pools with user-specified attributes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      a1056305
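The change in locking shape described above can be sketched with a toy model. Everything here is hypothetical: `struct pool_model`, the flag-based `pool_lock()` / `pool_unlock()` helpers and the `max_locks_held` counter exist only to make the difference observable, and real code would use the pools' spinlocks. The sketch contrasts holding every pool's lock at once with locking, processing and releasing one pool at a time.

```c
#include <assert.h>

#define NR_POOLS 2

/* Hypothetical model of a worker pool with a trivial flag as its lock. */
struct pool_model {
    int locked;
    int processed;
};

static int locks_held;       /* pool locks held right now */
static int max_locks_held;   /* high-water mark, for inspection */

static void pool_lock(struct pool_model *p)
{
    assert(!p->locked);
    p->locked = 1;
    if (++locks_held > max_locks_held)
        max_locks_held = locks_held;
}

static void pool_unlock(struct pool_model *p)
{
    assert(p->locked);
    p->locked = 0;
    locks_held--;
}

/* Old shape: take every pool's lock, process all pools, release all. */
static void process_all_locked(struct pool_model *pools)
{
    int i;

    for (i = 0; i < NR_POOLS; i++)
        pool_lock(&pools[i]);
    for (i = 0; i < NR_POOLS; i++)
        pools[i].processed++;
    for (i = NR_POOLS - 1; i >= 0; i--)
        pool_unlock(&pools[i]);
}

/* New shape: lock one pool, process it, unlock it, move to the next. */
static void process_per_pool(struct pool_model *pools)
{
    int i;

    for (i = 0; i < NR_POOLS; i++) {
        pool_lock(&pools[i]);
        pools[i].processed++;
        pool_unlock(&pools[i]);
    }
}
```

Within each pool the work done under the lock is the same in both shapes; only the ordering across pools and the number of locks held together differ.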
    • T
      workqueue: make hotplug processing per-pool · 94cf58bb
      Tejun Heo committed
      Instead of holding locks from both pools and then processing the pools
      together, make hotplug processing per-pool - grab locks of one pool,
      process it, release it and then proceed to the next pool.
      
      rebind_workers() is updated to take and process @pool instead of @gcwq
      which results in a lot of de-indentation.  gcwq_claim_assoc_and_lock()
      and its counterpart are replaced with in-line per-pool locking.
      
      While this patch changes processing order across pools, order within
      each pool remains the same.  As each pool is independent, this
      shouldn't break anything.
      
      This is part of an effort to remove global_cwq and make worker_pool
      the top level abstraction, which in turn will help implementing worker
      pools with user-specified attributes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      94cf58bb