1. 29 6月, 2010 6 次提交
    • T
      workqueue: reimplement workqueue flushing using color coded works · 73f53c4a
      Tejun Heo 提交于
      Reimplement workqueue flushing using color coded works.  wq has the
      current work color which is painted on the works being issued via
      cwqs.  Flushing a workqueue is achieved by advancing the current work
      colors of cwqs and waiting for all the works which have any of the
      previous colors to drain.
      
      Currently there are 16 possible colors, one is reserved for no color
      and 15 colors are useable allowing 14 concurrent flushes.  When color
      space gets full, flush attempts are batched up and processed together
      when color frees up, so even with many concurrent flushers, the new
      implementation won't build up huge queue of flushers which has to be
      processed one after another.
      
      Only works which are queued via __queue_work() are colored.  Works
      which are directly put on queue using insert_work() use NO_COLOR and
      don't participate in workqueue flushing.  Currently only works used
      for work-specific flush fall in this category.
      
      This new implementation leaves only cleanup_workqueue_thread() as the
      user of flush_cpu_workqueue().  Just make its users use
      flush_workqueue() and kthread_stop() directly and kill
      cleanup_workqueue_thread().  As workqueue flushing doesn't use barrier
      request anymore, the comment describing the complex synchronization
      around it in cleanup_workqueue_thread() is removed together with the
      function.
      
      This new implementation is to allow having and sharing multiple
      workers per cpu.
      
      Please note that one more bit is reserved for a future work flag by
      this patch.  This is to avoid shifting bits and updating comments
      later.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      73f53c4a
    • T
      workqueue: update cwq alignement · 0f900049
      Tejun Heo 提交于
      work->data field is used for two purposes.  It points to cwq it's
      queued on and the lower bits are used for flags.  Currently, two bits
      are reserved which is always safe as 4 byte alignment is guaranteed on
      every architecture.  However, future changes will need more flag bits.
      
      On SMP, the percpu allocator is capable of honoring larger alignment
      (there are other users which depend on it) and larger alignment works
      just fine.  On UP, percpu allocator is a thin wrapper around
      kzalloc/kfree() and don't honor alignment request.
      
      This patch introduces WORK_STRUCT_FLAG_BITS and implements
      alloc/free_cwqs() which guarantees max(1 << WORK_STRUCT_FLAG_BITS,
      __alignof__(unsigned long long) alignment both on SMP and UP.  On SMP,
      simply wrapping percpu allocator is enough.  On UP, extra space is
      allocated so that cwq can be aligned and the original pointer can be
      stored after it which is used in the free path.
      
      * Alignment problem on UP is reported by Michal Simek.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NMichal Simek <michal.simek@petalogix.com>
      0f900049
    • T
      workqueue: define masks for work flags and conditionalize STATIC flags · 22df02bb
      Tejun Heo 提交于
      Work flags are about to see more traditional mask handling.  Define
      WORK_STRUCT_*_BIT as the bit position constant and redefine
      WORK_STRUCT_* as bit masks.  Also, make WORK_STRUCT_STATIC_* flags
      conditional
      
      While at it, re-define these constants as enums and use
      WORK_STRUCT_STATIC instead of hard-coding 2 in
      WORK_DATA_STATIC_INIT().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      22df02bb
    • T
      workqueue: merge feature parameters into flags · 97e37d7b
      Tejun Heo 提交于
      Currently, __create_workqueue_key() takes @singlethread and
      @freezeable paramters and store them separately in workqueue_struct.
      Merge them into a single flags parameter and field and use
      WQ_FREEZEABLE and WQ_SINGLE_THREAD.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      97e37d7b
    • T
      workqueue: misc/cosmetic updates · 4690c4ab
      Tejun Heo 提交于
      Make the following updates in preparation of concurrency managed
      workqueue.  None of these changes causes any visible behavior
      difference.
      
      * Add comments and adjust indentations to data structures and several
        functions.
      
      * Rename wq_per_cpu() to get_cwq() and swap the position of two
        parameters for consistency.  Convert a direct per_cpu_ptr() access
        to wq->cpu_wq to get_cwq().
      
      * Add work_static() and Update set_wq_data() such that it sets the
        flags part to WORK_STRUCT_PENDING | WORK_STRUCT_STATIC if static |
        @extra_flags.
      
      * Move santiy check on work->entry emptiness from queue_work_on() to
        __queue_work() which all queueing paths share.
      
      * Make __queue_work() take @cpu and @wq instead of @cwq.
      
      * Restructure flush_work() and __create_workqueue_key() to make them
        easier to modify.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      4690c4ab
    • T
      workqueue: kill RT workqueue · c790bce0
      Tejun Heo 提交于
      With stop_machine() converted to use cpu_stop, RT workqueue doesn't
      have any user left.  Kill RT workqueue support.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      c790bce0
  2. 16 11月, 2009 1 次提交
  3. 15 10月, 2009 2 次提交
  4. 21 9月, 2009 1 次提交
  5. 06 9月, 2009 1 次提交
  6. 03 4月, 2009 1 次提交
  7. 22 1月, 2009 1 次提交
  8. 12 1月, 2009 1 次提交
  9. 06 11月, 2008 1 次提交
    • R
      cpumask: introduce new API, without changing anything · 2d3854a3
      Rusty Russell 提交于
      Impact: introduce new APIs
      
      We want to deprecate cpumasks on the stack, as we are headed for
      gynormous numbers of CPUs.  Eventually, we want to head towards an
      undefined 'struct cpumask' so they can never be declared on stack.
      
      1) New cpumask functions which take pointers instead of copies.
         (cpus_* -> cpumask_*)
      
      2) Several new helpers to reduce requirements for temporary cpumasks
         (cpumask_first_and, cpumask_next_and, cpumask_any_and)
      
      3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
         (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
      
      4) 'struct cpumask' for explicitness and to mark new-style code.
      
      5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
         not NR_CPUS for time efficiency and for smaller dynamic allocations
         in future.
      
      6) cpumask_copy() so we can allocate less than a full cpumask eventually
         (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
         definition eventually.
      
      7) work_on_cpu() helper for doing task on a CPU, rather than saving old
         cpumask for current thread and manipulating it.
      
      8) smp_call_function_many() which is smp_call_function_mask() except
         taking a cpumask pointer.
      
      Note that this patch simply introduces the new functions and leaves
      the obsolescent ones in place.  This is to simplify the transition
      patches.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2d3854a3
  10. 22 10月, 2008 1 次提交
  11. 26 7月, 2008 1 次提交
    • O
      workqueues: implement flush_work() · db700897
      Oleg Nesterov 提交于
      Most of users of flush_workqueue() can be changed to use cancel_work_sync(),
      but sometimes we really need to wait for the completion and cancelling is not
      an option. schedule_on_each_cpu() is good example.
      
      Add the new helper, flush_work(work), which waits for the completion of the
      specific work_struct. More precisely, it "flushes" the result of of the last
      queue_work() which is visible to the caller.
      
      For example, this code
      
      	queue_work(wq, work);
      	/* WINDOW */
      	queue_work(wq, work);
      
      	flush_work(work);
      
      doesn't necessary work "as expected". What can happen in the WINDOW above is
      
      	- wq starts the execution of work->func()
      
      	- the caller migrates to another CPU
      
      now, after the 2nd queue_work() this work is active on the previous CPU, and
      at the same time it is queued on another. In this case flush_work(work) may
      return before the first work->func() completes.
      
      It is trivial to add another helper
      
      	int flush_work_sync(struct work_struct *work)
      	{
      		return flush_work(work) || wait_on_work(work);
      	}
      
      which works "more correctly", but it has to iterate over all CPUs and thus
      it much slower than flush_work().
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: NMax Krasnyansky <maxk@qualcomm.com>
      Acked-by: NJarek Poplawski <jarkao2@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      db700897
  12. 25 7月, 2008 1 次提交
  13. 14 2月, 2008 1 次提交
  14. 16 1月, 2008 1 次提交
    • J
      lockdep: fix workqueue creation API lockdep interaction · eb13ba87
      Johannes Berg 提交于
      Dave Young reported warnings from lockdep that the workqueue API
      can sometimes try to register lockdep classes with the same key
      but different names. This is not permitted in lockdep.
      
      Unfortunately, I was unaware of that restriction when I wrote
      the code to debug workqueue problems with lockdep and used the
      workqueue name as the lockdep class name. This can obviously
      lead to the problem if the workqueue name is dynamic.
      
      This patch solves the problem by always using a constant name
      for the workqueue's lockdep class, namely either the constant
      name that was passed in or a string consisting of the variable
      name.
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      eb13ba87
  15. 20 10月, 2007 1 次提交
  16. 17 7月, 2007 2 次提交
  17. 18 5月, 2007 1 次提交
  18. 17 5月, 2007 1 次提交
  19. 10 5月, 2007 5 次提交
    • O
      unify flush_work/flush_work_keventd and rename it to cancel_work_sync · 28e53bdd
      Oleg Nesterov 提交于
      flush_work(wq, work) doesn't need the first parameter, we can use cwq->wq
      (this was possible from the very beginnig, I missed this).  So we can unify
      flush_work_keventd and flush_work.
      
      Also, rename flush_work() to cancel_work_sync() and fix all callers.
      Perhaps this is not the best name, but "flush_work" is really bad.
      
      (akpm: this is why the earlier patches bypassed maintainers)
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: Jeff Garzik <jeff@garzik.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Auke Kok <auke-jan.h.kok@intel.com>,
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      28e53bdd
    • O
      workqueue: kill NOAUTOREL works · 23b2e599
      Oleg Nesterov 提交于
      We don't have any users, and it is not so trivial to use NOAUTOREL works
      correctly.  It is better to simplify API.
      
      Delete NOAUTOREL support and rename work_release to work_clear_pending to
      avoid a confusion.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      23b2e599
    • O
      make cancel_rearming_delayed_work() work on any workqueue, not just keventd_wq · 1634c48f
      Oleg Nesterov 提交于
      cancel_rearming_delayed_workqueue(wq, dwork) doesn't need the first
      parameter.  We don't hang on un-queued dwork any longer, and work->data
      doesn't change its type.  This means we can always figure out "wq" from
      dwork when it is needed.
      
      Remove this parameter, and rename the function to
      cancel_rearming_delayed_work().  Re-create an inline "obsolete"
      cancel_rearming_delayed_workqueue(wq) which just calls
      cancel_rearming_delayed_work().
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1634c48f
    • O
      workqueue: kill run_scheduled_work() · 7097a87a
      Oleg Nesterov 提交于
      Because it has no callers.
      
      Actually, I think the whole idea of run_scheduled_work() was not right, not
      good to mix "unqueue this work and execute its ->func()" in one function.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7097a87a
    • O
      implement flush_work() · b89deed3
      Oleg Nesterov 提交于
      A basic problem with flush_scheduled_work() is that it blocks behind _all_
      presently-queued works, rather than just the work whcih the caller wants to
      flush.  If the caller holds some lock, and if one of the queued work happens
      to want that lock as well then accidental deadlocks can occur.
      
      One example of this is the phy layer: it wants to flush work while holding
      rtnl_lock().  But if a linkwatch event happens to be queued, the phy code will
      deadlock because the linkwatch callback function takes rtnl_lock.
      
      So we implement a new function which will flush a *single* work - just the one
      which the caller wants to free up.  Thus we avoid the accidental deadlocks
      which can arise from unrelated subsystems' callbacks taking shared locks.
      
      flush_work() non-blockingly dequeues the work_struct which we want to kill,
      then it waits for its handler to complete on all CPUs.
      
      Add ->current_work to the "struct cpu_workqueue_struct", it points to
      currently running "struct work_struct". When flush_work(work) detects
      ->current_work == work, it inserts a barrier at the _head_ of ->worklist
      (and thus right _after_ that work) and waits for completition. This means
      that the next work fired on that CPU will be this barrier, or another
      barrier queued by concurrent flush_work(), so the caller of flush_work()
      will be woken before any "regular" work has a chance to run.
      
      When wait_on_work() unlocks workqueue_mutex (or whatever we choose to protect
      against CPU hotplug), CPU may go away. But in that case take_over_work() will
      move a barrier we queued to another CPU, it will be fired sometime, and
      wait_on_work() will be woken.
      
      Actually, we are doing cleanup_workqueue_thread()->kthread_stop() before
      take_over_work(), so cwq->thread should complete its ->worklist (and thus
      the barrier), because currently we don't check kthread_should_stop() in
      run_workqueue(). But even if we did, everything should be ok.
      
      [akpm@osdl.org: cleanup]
      [akpm@osdl.org: add flush_work_keventd() wrapper]
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b89deed3
  20. 09 5月, 2007 1 次提交
  21. 27 4月, 2007 1 次提交
    • O
      [WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync() · 071b6386
      Oleg Nesterov 提交于
      del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
      efficient since it locks the timer unconditionally, and may wait for the
      completion of the delayed_work_timer_fn().
      
      cancel_delayed_work() == 0 means:
      
      	before this patch:
      		work->func may still be running or queued
      
      	after this patch:
      		work->func may still be running or queued, or
      		delayed_work_timer_fn->__queue_work() in progress.
      
      		The latter doesn't differ from the caller's POV,
      		delayed_work_timer_fn() is called with _PENDING
      		bit set.
      
      cancel_delayed_work() == 1 with this patch adds a new possibility:
      
      	delayed_work->work was cancelled, but delayed_work_timer_fn
      	is still running (this is only possible for the re-arming
      	works on single-threaded workqueue).
      
      	In this case the timer was re-started by work->func(), nobody
      	else can do this. This in turn means that delayed_work_timer_fn
      	has already passed __queue_work() (and wont't touch delayed_work)
      	because nobody else can queue delayed_work->work.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      071b6386
  22. 17 12月, 2006 1 次提交
    • L
      Make workqueue bit operations work on "atomic_long_t" · a08727ba
      Linus Torvalds 提交于
      On architectures where the atomicity of the bit operations is handled by
      external means (ie a separate spinlock to protect concurrent accesses),
      just doing a direct assignment on the workqueue data field (as done by
      commit 4594bf15) can cause the
      assignment to be lost due to lack of serialization with the bitops on
      the same word.
      
      So we need to serialize the assignment with the locks on those
      architectures (notably older ARM chips, PA-RISC and sparc32).
      
      So rather than using an "unsigned long", let's use "atomic_long_t",
      which already has a safe assignment operation (atomic_long_set()) on
      such architectures.
      
      This requires that the atomic operations use the same atomicity locks as
      the bit operations do, but that is largely the case anyway.  Sparc32
      will probably need fixing.
      
      Architectures (including modern ARM with LL/SC) that implement sane
      atomic operations for SMP won't see any of this matter.
      
      Cc: Russell King <rmk+lkml@arm.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David Miller <davem@davemloft.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Linux Arch Maintainers <linux-arch@vger.kernel.org>
      Cc: Andrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a08727ba
  23. 16 12月, 2006 1 次提交
  24. 08 12月, 2006 2 次提交
  25. 22 11月, 2006 4 次提交
    • D
      WorkStruct: Pass the work_struct pointer instead of context data · 65f27f38
      David Howells 提交于
      Pass the work_struct pointer to the work function rather than context data.
      The work function can use container_of() to work out the data.
      
      For the cases where the container of the work_struct may go away the moment the
      pending bit is cleared, it is made possible to defer the release of the
      structure by deferring the clearing of the pending bit.
      
      To make this work, an extra flag is introduced into the management side of the
      work_struct.  This governs auto-release of the structure upon execution.
      
      Ordinarily, the work queue executor would release the work_struct for further
      scheduling or deallocation by clearing the pending bit prior to jumping to the
      work function.  This means that, unless the driver makes some guarantee itself
      that the work_struct won't go away, the work function may not access anything
      else in the work_struct or its container lest they be deallocated..  This is a
      problem if the auxiliary data is taken away (as done by the last patch).
      
      However, if the pending bit is *not* cleared before jumping to the work
      function, then the work function *may* access the work_struct and its container
      with no problems.  But then the work function must itself release the
      work_struct by calling work_release().
      
      In most cases, automatic release is fine, so this is the default.  Special
      initiators exist for the non-auto-release case (ending in _NAR).
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      65f27f38
    • D
      WorkStruct: Merge the pending bit into the wq_data pointer · 365970a1
      David Howells 提交于
      Reclaim a word from the size of the work_struct by folding the pending bit and
      the wq_data pointer together.  This shouldn't cause misalignment problems as
      all pointers should be at least 4-byte aligned.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      365970a1
    • D
      WorkStruct: Typedef the work function prototype · 6bb49e59
      David Howells 提交于
      Define a type for the work function prototype.  It's not only kept in the
      work_struct struct, it's also passed as an argument to several functions.
      
      This makes it easier to change it.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      6bb49e59
    • D
      WorkStruct: Separate delayable and non-delayable events. · 52bad64d
      David Howells 提交于
      Separate delayable work items from non-delayable work items be splitting them
      into a separate structure (delayed_work), which incorporates a work_struct and
      the timer_list removed from work_struct.
      
      The work_struct struct is huge, and this limits it's usefulness.  On a 64-bit
      architecture it's nearly 100 bytes in size.  This reduces that by half for the
      non-delayable type of event.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      52bad64d