1. 29 June 2010, 16 commits
    • workqueue: reimplement CPU hotplugging support using trustee · db7bccf4
      Committed by Tejun Heo
      Reimplement CPU hotplugging support using a trustee thread.  On CPU
      down, a trustee thread is created, each step of CPU down is
      executed by the trustee, and workqueue_cpu_callback() simply drives
      and waits for trustee state transitions.
      
      The CPU down operation no longer waits for works to be drained;
      instead, the trustee sticks around till all pending works have been
      completed.  If the CPU is brought back up while works are still
      draining, workqueue_cpu_callback() tells the trustee to step down
      and the workers to rebind to the cpu.
      
      As it's difficult to tell whether cwqs are empty while freezing is
      in progress or complete, the trustee doesn't consider draining to
      be done while a gcwq is freezing or frozen (tracked by the new
      GCWQ_FREEZING flag).  Also, workers which get unbound from their
      cpu are marked with WORKER_ROGUE.
      
      The trustee-based implementation doesn't bring any new features at
      this point, but it will be used to manage the worker pool when the
      dynamic shared worker pool is implemented.  The trustee states are
      sketched below.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      db7bccf4
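      A rough sketch of the trustee states described above; the names
      below follow the patch, but the transitions are simplified and a
      couple of intermediate states are omitted:

        /* Trustee states: workqueue_cpu_callback() requests a state and
         * waits for the trustee thread to reach it. */
        enum trustee_state {
                TRUSTEE_START,          /* created on CPU_DOWN_PREPARE */
                TRUSTEE_IN_CHARGE,      /* owns the gcwq; workers are ROGUE */
                TRUSTEE_RELEASE,        /* CPU came back up; workers rebind */
                TRUSTEE_DONE,           /* pending works drained; exit */
        };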
    • workqueue: implement worker states · c8e55f36
      Committed by Tejun Heo
      Implement worker states.  After creation, a worker is STARTED.
      While a worker isn't processing a work, it's IDLE and chained on
      gcwq->idle_list.  While processing a work, a worker is BUSY and
      chained on gcwq->busy_hash.  gcwq now also counts the total number
      of workers and the number of idle ones.
      
      worker_thread() is restructured to reflect state transitions.
      cwq->more_work is removed and waking up a worker makes it check for
      events.  A worker is killed by setting the DIE flag while it's IDLE
      and waking it up.
      
      This gives the gcwq better visibility into what's going on and
      allows it to quickly find out whether a work is already executing,
      which is necessary for having multiple workers process the same
      cwq.  The worker flags are sketched below.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c8e55f36
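      A sketch of the worker state flags this message describes; the bit
      assignments are illustrative:

        enum {
                WORKER_STARTED  = 1 << 0,  /* creation complete */
                WORKER_DIE      = 1 << 1,  /* set while IDLE, then wake up */
                WORKER_IDLE     = 1 << 2,  /* on gcwq->idle_list */
        };

        /* BUSY workers are instead hashed on gcwq->busy_hash, keyed by
         * the work they are executing, which is what makes "is this
         * work already running?" a cheap lookup. */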
    • workqueue: introduce global cwq and unify cwq locks · 8b03ae3c
      Committed by Tejun Heo
      There is one gcwq (global cwq) per cpu and all cwqs on a cpu point
      to it.  A gcwq contains a lock to be used by all cwqs on the cpu
      and an ida to give IDs to workers belonging to the cpu.
      
      This patch introduces gcwq, moves worker_ida into gcwq and makes
      all cwqs on the same cpu use the cpu's gcwq->lock instead of
      separate locks.  gcwq->ida is now protected by gcwq->lock too.  A
      sketch of the new structure follows below.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      8b03ae3c
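      A minimal sketch of the new structure, assuming only the fields
      named in the message:

        /* one per cpu; all cwqs on the cpu point here */
        struct global_cwq {
                spinlock_t      lock;        /* shared by all cwqs on cpu */
                unsigned int    cpu;         /* the associated cpu */
                struct ida      worker_ida;  /* worker IDs, under lock */
        } ____cacheline_aligned_in_smp;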
    • workqueue: reimplement workqueue freeze using max_active · a0a1a5fd
      Committed by Tejun Heo
      Currently, workqueue freezing is implemented by marking the worker
      freezeable and calling try_to_freeze() from the dispatch loop.
      Reimplement it using cwq->max_active so that the workqueue is
      frozen instead of the worker.
      
      * workqueue_struct->saved_max_active is added which stores the
        specified max_active on initialization.
      
      * On freeze, all cwq->max_active's are quenched to zero.  Freezing
        is complete when nr_active on all cwqs reaches zero.
      
      * On thaw, all cwq->max_active's are restored to wq->saved_max_active
        and the worklist is repopulated.
      
      This new implementation allows having a single shared pool of
      workers per cpu.  The freeze path is sketched below.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a0a1a5fd
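      A minimal sketch of the freeze path, with locking elided and
      assuming the workqueues list and the get_cwq() helper from the
      earlier patches in this series:

        /* stop new works from becoming active on freezeable wqs */
        void freeze_workqueues_begin(void)
        {
                struct workqueue_struct *wq;
                unsigned int cpu;

                for_each_possible_cpu(cpu) {
                        list_for_each_entry(wq, &workqueues, list) {
                                struct cpu_workqueue_struct *cwq =
                                                        get_cwq(cpu, wq);

                                if (wq->flags & WQ_FREEZEABLE)
                                        cwq->max_active = 0;
                        }
                }
        }

        /* freezing is complete once every cwq's nr_active reaches 0;
         * thawing restores wq->saved_max_active and repopulates the
         * worklist from the delayed works */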
    • workqueue: implement per-cwq active work limit · 1e19ffc6
      Committed by Tejun Heo
      Add cwq->nr_active, cwq->max_active and cwq->delayed_works.
      nr_active counts the number of active works per cwq.  A work is
      active if it's flushable (colored) and is on the cwq's worklist.
      If nr_active reaches max_active, new works are queued on
      cwq->delayed_works and activated later as works on the cwq
      complete and decrement nr_active.
      
      cwq->max_active can be specified via the new @max_active parameter
      to __create_workqueue() and is set to 1 for all workqueues for now.
      As each cwq has only a single worker now, this double queueing
      doesn't cause any behavior difference visible to its users.  The
      queueing decision is sketched below.
      
      This will be used to reimplement freeze/thaw and implement shared
      worker pool.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1e19ffc6
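      The queueing decision, simplified; queue_to() is a hypothetical
      wrapper standing in for the relevant part of __queue_work():

        static void queue_to(struct cpu_workqueue_struct *cwq,
                             struct work_struct *work,
                             unsigned int work_flags)
        {
                struct list_head *worklist;

                if (likely(cwq->nr_active < cwq->max_active)) {
                        cwq->nr_active++;               /* becomes active */
                        worklist = &cwq->worklist;
                } else {
                        worklist = &cwq->delayed_works; /* activated later */
                }
                insert_work(cwq, work, worklist, work_flags);
        }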
    • workqueue: reimplement work flushing using linked works · affee4b2
      Committed by Tejun Heo
      A work is linked to the next one by having the WORK_STRUCT_LINKED
      bit set, and these links can be chained.  When a linked work is
      dispatched to a worker, all linked works are dispatched to the
      worker's newly added ->scheduled queue and processed back-to-back
      (see the sketch below).

      Currently, as there's only a single worker per cwq, having linked
      works doesn't make any visible behavior difference.  This change
      is to prepare for multiple shared workers per cpu.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      affee4b2
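      A sketch of the dispatch step, simplified from the patch: move a
      work and everything LINKED behind it onto the worker's ->scheduled
      list.

        static void move_linked_works(struct work_struct *work,
                                      struct list_head *head)
        {
                struct work_struct *n;

                /* NULL list head is fine here: the loop always breaks
                 * at the last linked work before wrapping around */
                list_for_each_entry_safe_from(work, n, NULL, entry) {
                        list_move_tail(&work->entry, head);
                        if (!(*work_data_bits(work) & WORK_STRUCT_LINKED))
                                break;
                }
        }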
    • workqueue: introduce worker · c34056a3
      Committed by Tejun Heo
      Separate out worker-thread-related information into struct worker
      from struct cpu_workqueue_struct and implement helper functions to
      deal with the new struct worker.  The only externally visible
      change is that workqueue workers are now all named
      "kworker/CPUID:WORKERID", where WORKERID is allocated from a
      per-cpu ida.  The new struct is sketched below.

      This is in preparation for concurrency managed workqueue, where
      multiple shared workers will be available per cpu.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c34056a3
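      A sketch of the separated-out structure and the new thread naming;
      the field set is abbreviated:

        struct worker {
                struct work_struct  *current_work;  /* being processed */
                struct task_struct  *task;          /* worker task */
                struct cpu_workqueue_struct *cwq;   /* associated cwq */
                int                 id;             /* from per-cpu ida */
        };

        /* in create_worker(): */
        worker->task = kthread_create(worker_thread, worker,
                                      "kworker/%u:%d", cpu, worker->id);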
    • workqueue: reimplement workqueue flushing using color coded works · 73f53c4a
      Committed by Tejun Heo
      Reimplement workqueue flushing using color coded works.  wq has the
      current work color which is painted on the works being issued via
      cwqs.  Flushing a workqueue is achieved by advancing the current
      work colors of cwqs and waiting for all the works which have any of
      the previous colors to drain.

      Currently there are 16 possible colors: one is reserved for no
      color and 15 colors are usable, allowing 14 concurrent flushes.
      When the color space gets full, flush attempts are batched up and
      processed together when a color frees up, so even with many
      concurrent flushers, the new implementation won't build up a huge
      queue of flushers which has to be processed one after another.
      
      Only works which are queued via __queue_work() are colored.  Works
      which are put directly on a queue using insert_work() use NO_COLOR
      and don't participate in workqueue flushing.  Currently only works
      used for work-specific flushes fall in this category.
      
      This new implementation leaves cleanup_workqueue_thread() as the
      only user of flush_cpu_workqueue().  Just make its users use
      flush_workqueue() and kthread_stop() directly and kill
      cleanup_workqueue_thread().  As workqueue flushing no longer uses a
      barrier request, the comment describing the complex synchronization
      around it in cleanup_workqueue_thread() is removed together with
      the function.
      
      This new implementation is to allow having and sharing multiple
      workers per cpu.  The color constants are sketched below.
      
      Please note that one more bit is reserved for a future work flag by
      this patch.  This is to avoid shifting bits and updating comments
      later.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      73f53c4a
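      The color constants as described above (4 flag bits for the color,
      one value reserved for no color); a sketch:

        enum {
                WORK_STRUCT_COLOR_BITS  = 4,
                WORK_NR_COLORS          = (1 << WORK_STRUCT_COLOR_BITS) - 1,
                WORK_NO_COLOR           = WORK_NR_COLORS, /* not flushed */
        };

        /* flushing advances wq->work_color and waits for every work
         * still carrying a previous color to drain */
        static int work_next_color(int color)
        {
                return (color + 1) % WORK_NR_COLORS;
        }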
    • workqueue: update cwq alignment · 0f900049
      Committed by Tejun Heo
      The work->data field is used for two purposes: it points to the cwq
      the work is queued on, and the lower bits are used for flags.
      Currently, two bits are reserved, which is always safe as 4-byte
      alignment is guaranteed on every architecture.  However, future
      changes will need more flag bits.
      
      On SMP, the percpu allocator is capable of honoring larger
      alignment (there are other users which depend on it) and larger
      alignment works just fine.  On UP, the percpu allocator is a thin
      wrapper around kzalloc/kfree() and doesn't honor alignment
      requests.
      
      This patch introduces WORK_STRUCT_FLAG_BITS and implements
      alloc/free_cwqs(), which guarantee
      max(1 << WORK_STRUCT_FLAG_BITS, __alignof__(unsigned long long))
      alignment both on SMP and UP.  On SMP, simply wrapping the percpu
      allocator is enough.  On UP, extra space is allocated so that the
      cwq can be aligned and the original pointer can be stored after it,
      to be used in the free path.  The UP path is sketched below.
      
      * The alignment problem on UP was reported by Michal Simek.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Reported-by: Michal Simek <michal.simek@petalogix.com>
      0f900049
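      A sketch of alloc_cwqs() close to what the patch does, with error
      handling trimmed:

        static struct cpu_workqueue_struct *alloc_cwqs(void)
        {
                const size_t size = sizeof(struct cpu_workqueue_struct);
                const size_t align = max_t(size_t,
                                           1 << WORK_STRUCT_FLAG_BITS,
                                           __alignof__(unsigned long long));
                struct cpu_workqueue_struct *cwqs;
        #ifndef CONFIG_SMP
                void *ptr;

                /* UP: over-allocate, align by hand, and stash the
                 * original pointer right after the cwqs for freeing */
                ptr = kzalloc(size + align + sizeof(void *), GFP_KERNEL);
                if (!ptr)
                        return NULL;
                cwqs = PTR_ALIGN(ptr, align);
                *(void **)(cwqs + 1) = ptr;
        #else
                /* SMP: the percpu allocator honors the alignment */
                cwqs = __alloc_percpu(size, align);
        #endif
                return cwqs;
        }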
    • workqueue: kill cpu_populated_map · 1537663f
      Committed by Tejun Heo
      Worker management is about to be overhauled.  Simplify things by
      removing cpu_populated_map, creating workers for all possible cpus
      and making single-threaded workqueues behave more like
      multi-threaded ones.
      
      After this patch, all cwqs are always initialized, all workqueues
      are linked on the workqueues list and workers for all possible cpus
      always exist.  This also makes CPU hotplug support simpler:
      checking ->cpus_allowed before processing works in worker_thread()
      and flushing cwqs on CPU_POST_DEAD are enough.
      
      While at it, make get_cwq() always return the cwq for the specified
      cpu, add target_cwq() for cases where single thread distinction is
      necessary and drop all direct usage of per_cpu_ptr() on wq->cpu_wq.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1537663f
    • workqueue: temporarily remove workqueue tracing · 64166699
      Committed by Tejun Heo
      Strip tracing code from workqueue and remove workqueue tracing.
      This is a temporary measure till concurrency managed workqueue is
      complete.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      64166699
    • workqueue: separate out process_one_work() · a62428c0
      Committed by Tejun Heo
      Separate process_one_work() out of run_workqueue().  This patch
      doesn't cause any behavior change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a62428c0
    • workqueue: define masks for work flags and conditionalize STATIC flags · 22df02bb
      Committed by Tejun Heo
      Work flags are about to see more traditional mask handling.  Define
      WORK_STRUCT_*_BIT as the bit position constants and redefine
      WORK_STRUCT_* as bit masks.  Also, make the WORK_STRUCT_STATIC_*
      flags conditional on CONFIG_DEBUG_OBJECTS_WORK (see the sketch
      below).
      
      While at it, re-define these constants as enums and use
      WORK_STRUCT_STATIC instead of hard-coding 2 in
      WORK_DATA_STATIC_INIT().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      22df02bb
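      A sketch of the resulting layout: *_BIT carries the bit position,
      the unsuffixed name is the mask, and STATIC only exists when work
      debugobjects are configured:

        enum {
                WORK_STRUCT_PENDING_BIT = 0,  /* work is pending execution */
        #ifdef CONFIG_DEBUG_OBJECTS_WORK
                WORK_STRUCT_STATIC_BIT  = 1,  /* static initializer */
        #endif

                WORK_STRUCT_PENDING     = 1 << WORK_STRUCT_PENDING_BIT,
        #ifdef CONFIG_DEBUG_OBJECTS_WORK
                WORK_STRUCT_STATIC      = 1 << WORK_STRUCT_STATIC_BIT,
        #else
                WORK_STRUCT_STATIC      = 0,
        #endif
        };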
    • workqueue: merge feature parameters into flags · 97e37d7b
      Committed by Tejun Heo
      Currently, __create_workqueue_key() takes @singlethread and
      @freezeable parameters and stores them separately in
      workqueue_struct.  Merge them into a single flags parameter and
      field, and use WQ_FREEZEABLE and WQ_SINGLE_THREAD (sketched below).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      97e37d7b
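      A sketch of the merged flags and the resulting wrappers:

        enum {
                WQ_FREEZEABLE           = 1 << 0,  /* freeze on suspend */
                WQ_SINGLE_THREAD        = 1 << 1,  /* no per-cpu worker */
        };

        #define create_workqueue(name)                          \
                __create_workqueue((name), 0)
        #define create_freezeable_workqueue(name)               \
                __create_workqueue((name),                      \
                                   WQ_FREEZEABLE | WQ_SINGLE_THREAD)
        #define create_singlethread_workqueue(name)             \
                __create_workqueue((name), WQ_SINGLE_THREAD)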
    • workqueue: misc/cosmetic updates · 4690c4ab
      Committed by Tejun Heo
      Make the following updates in preparation for concurrency managed
      workqueue.  None of these changes causes any visible behavior
      difference.
      
      * Add comments and adjust indentations to data structures and several
        functions.
      
      * Rename wq_per_cpu() to get_cwq() and swap the positions of its
        two parameters for consistency.  Convert a direct per_cpu_ptr()
        access to wq->cpu_wq to get_cwq().
      
      * Add work_static() and update set_wq_data() such that it sets the
        flags part to WORK_STRUCT_PENDING | @extra_flags, plus
        WORK_STRUCT_STATIC if the work is static.
      
      * Move the sanity check on work->entry emptiness from
        queue_work_on() to __queue_work(), which all queueing paths
        share.
      
      * Make __queue_work() take @cpu and @wq instead of @cwq.
      
      * Restructure flush_work() and __create_workqueue_key() to make them
        easier to modify.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      4690c4ab
    • workqueue: kill RT workqueue · c790bce0
      Committed by Tejun Heo
      With stop_machine() converted to use cpu_stop, RT workqueue doesn't
      have any users left.  Kill RT workqueue support.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c790bce0
  2. 28 May 2010, 1 commit
  3. 30 April 2010, 3 commits
  4. 18 November 2009, 1 commit
  5. 16 November 2009, 1 commit
  6. 19 October 2009, 1 commit
    • HWPOISON: Allow schedule_on_each_cpu() from keventd · 65a64464
      Committed by Andi Kleen
      Right now, calling schedule_on_each_cpu() from keventd deadlocks
      because it tries to schedule a work item on the current CPU too.
      This happens via lru_add_drain_all() in hwpoison.

      Just call the function directly for the current CPU in this case.
      This is actually faster too; the resulting logic is sketched below.
      
      Debugging with Fengguang Wu & Max Asbock
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      65a64464
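      A sketch of the fixed function, close to the actual change: when
      already running in keventd, execute func directly for this CPU
      instead of queueing it (which would deadlock):

        int schedule_on_each_cpu(work_func_t func)
        {
                int cpu, orig = -1;
                struct work_struct *works;

                works = alloc_percpu(struct work_struct);
                if (!works)
                        return -ENOMEM;

                get_online_cpus();
                if (current_is_keventd())
                        orig = raw_smp_processor_id();

                for_each_online_cpu(cpu) {
                        struct work_struct *work = per_cpu_ptr(works, cpu);

                        INIT_WORK(work, func);
                        if (cpu != orig)
                                schedule_work_on(cpu, work);
                }
                if (orig >= 0)
                        func(per_cpu_ptr(works, orig));

                for_each_online_cpu(cpu)
                        flush_work(per_cpu_ptr(works, cpu));

                put_online_cpus();
                free_percpu(works);
                return 0;
        }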
  7. 15 October 2009, 2 commits
  8. 09 September 2009, 1 commit
  9. 04 August 2009, 1 commit
  10. 02 June 2009, 1 commit
    • ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro · fb39125f
      Committed by Zhaolei
      v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
          introduced by Steven Rostedt: consolidate trace and trace_event headers
      v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
          the work addr
      v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro
      
      TRACE_EVENT is a more generic way to define tracepoints.
      Doing so adds these new capabilities to the tracepoints:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      
      Then, this patch converts DEFINE_TRACE to TRACE_EVENT for the
      workqueue-related tracepoints; the shape of the conversion is
      sketched below.
      
      [ Impact: expand workqueue tracer to events tracing ]
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      fb39125f
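      The shape of one converted tracepoint; the event name and field
      layout below are illustrative rather than an exact copy of the
      patch:

        TRACE_EVENT(workqueue_execution,

                TP_PROTO(struct task_struct *wq_thread,
                         struct work_struct *work),

                TP_ARGS(wq_thread, work),

                TP_STRUCT__entry(
                        __array(char,   thread_comm,    TASK_COMM_LEN)
                        __field(pid_t,  thread_pid)
                        __field(work_func_t, func)
                ),

                TP_fast_assign(
                        memcpy(__entry->thread_comm, wq_thread->comm,
                               TASK_COMM_LEN);
                        __entry->thread_pid = wq_thread->pid;
                        __entry->func       = work->func;
                ),

                TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
                          __entry->thread_pid, __entry->func)
        );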
  11. 09 April 2009, 1 commit
    • work_on_cpu(): rewrite it to create a kernel thread on demand · 6b44003e
      Committed by Andrew Morton
      Impact: circular locking bugfix
      
      The various implementations and proposed implementations of
      work_on_cpu() are vulnerable to various deadlocks because they all
      used queues of some form.
      
      Unrelated pieces of kernel code thus gained dependencies wherein if one
      work_on_cpu() caller holds a lock which some other work_on_cpu() callback
      also takes, the kernel could rarely deadlock.
      
      Fix this by creating a short-lived kernel thread for each
      work_on_cpu() invocation, as sketched below.
      
      This is not terribly fast, but the only current caller of work_on_cpu() is
      pci_call_probe().
      
      It would be nice to find some other way of doing the node-local
      allocations in the PCI probe code so that we can zap work_on_cpu()
      altogether.  The code there is rather nasty.  I can't think of anything
      simple at this time...
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      6b44003e
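      The new implementation is small enough to sketch nearly in full:

        struct work_for_cpu {
                struct completion completion;
                long (*fn)(void *);
                void *arg;
                long ret;
        };

        static int do_work_for_cpu(void *_wfc)
        {
                struct work_for_cpu *wfc = _wfc;

                wfc->ret = wfc->fn(wfc->arg);
                complete(&wfc->completion);
                return 0;
        }

        long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
        {
                struct task_struct *sub_thread;
                struct work_for_cpu wfc = {
                        .completion =
                            COMPLETION_INITIALIZER_ONSTACK(wfc.completion),
                        .fn = fn,
                        .arg = arg,
                };

                sub_thread = kthread_create(do_work_for_cpu, &wfc,
                                            "work_for_cpu");
                if (IS_ERR(sub_thread))
                        return PTR_ERR(sub_thread);
                kthread_bind(sub_thread, cpu);
                wake_up_process(sub_thread);
                wait_for_completion(&wfc.completion);

                return wfc.ret;
        }

      No queue is shared with anything else, so the lock dependencies
      that caused the deadlocks simply disappear.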
  12. 03 April 2009, 1 commit
  13. 30 March 2009, 1 commit
  14. 20 January 2009, 2 commits
  15. 17 January 2009, 2 commits
  16. 14 January 2009, 1 commit
    • tracing: add a new workqueue tracer · e1d8aa9f
      Committed by Frederic Weisbecker
      Impact: new tracer
      
      The workqueue tracer provides some statistical information about
      each cpu workqueue thread, such as the number of works inserted and
      executed since its creation.  It can help to evaluate the amount of
      work each of them has to perform.  For example, it can help a
      developer decide whether to choose a per-cpu workqueue instead of a
      single-threaded one.
      
      It only traces statistical information for now, but it will
      probably provide event tracing too later.
      
      Such a tracer could also help with, and be improved for,
      rt-priority-sorted workqueue development.
      
      To get a snapshot of the workqueues' state at any time, just do:

      cat /debugfs/tracing/trace_stat/workqueues

      For example:
      
        1    125        125       reiserfs/1
        1      0          0       scsi_tgtd/1
        1      0          0       aio/1
        1      0          0       ata/1
        1    114        114       kblockd/1
        1      0          0       kintegrityd/1
        1   2147       2147       events/1
      
        0      0          0       kpsmoused
        0    105        105       reiserfs/0
        0      0          0       scsi_tgtd/0
        0      0          0       aio/0
        0      0          0       ata_aux
        0      0          0       ata/0
        0      0          0       cqueue
        0      0          0       kacpi_notify
        0      0          0       kacpid
        0    149        149       kblockd/0
        0      0          0       kintegrityd/0
        0   1000       1000       khelper
        0   2270       2270       events/0
      
      Changes in V2:
      
      _ Drop the static array based on NR_CPUS and dynamically allocate
        the stat array with num_possible_cpus() and other cpu mask
        facilities.
      _ Trace workqueue insertion at a slightly lower level (insert_work
        instead of queue_work) to handle even the workqueue barriers.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e1d8aa9f
  17. 01 January 2009, 1 commit
  18. 14 November 2008, 1 commit
  19. 06 November 2008, 1 commit
    • cpumask: introduce new API, without changing anything · 2d3854a3
      Committed by Rusty Russell
      Impact: introduce new APIs
      
      We want to deprecate cpumasks on the stack, as we are headed for
      ginormous numbers of CPUs.  Eventually, we want to head towards an
      undefined 'struct cpumask' so they can never be declared on the
      stack.
      
      1) New cpumask functions which take pointers instead of copies.
         (cpus_* -> cpumask_*)
      
      2) Several new helpers to reduce requirements for temporary cpumasks
         (cpumask_first_and, cpumask_next_and, cpumask_any_and)
      
      3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
         (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
      
      4) 'struct cpumask' for explicitness and to mark new-style code.
      
      5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
         not NR_CPUS for time efficiency and for smaller dynamic allocations
         in future.
      
      6) cpumask_copy() so we can allocate less than a full cpumask eventually
         (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
         definition eventually.
      
      7) work_on_cpu() helper for running a function on a given CPU,
         rather than saving the old cpumask for the current thread and
         manipulating it.
      
      8) smp_call_function_many() which is smp_call_function_mask() except
         taking a cpumask pointer.
      
      Note that this patch simply introduces the new functions and leaves
      the obsolescent ones in place.  This is to simplify the transition
      patches.  A usage sketch follows below.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2d3854a3
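      A usage sketch exercising a few of the new helpers; pick_cpu() and
      its arguments are hypothetical:

        static int pick_cpu(const struct cpumask *allowed,
                            const struct cpumask *online)
        {
                cpumask_var_t tmp;
                int cpu;

                /* no cpumask on the stack; goes off-stack automatically
                 * for large NR_CPUS */
                if (!alloc_cpumask_var(&tmp, GFP_KERNEL))
                        return -ENOMEM;

                cpumask_copy(tmp, allowed);           /* pointer-based copy */
                cpu = cpumask_first_and(tmp, online); /* no temporary AND */

                free_cpumask_var(tmp);
                /* searches stop at nr_cpu_ids, not NR_CPUS */
                return cpu < nr_cpu_ids ? cpu : -ENODEV;
        }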
  20. 22 October 2008, 1 commit