提交 · cd42d559e45e3563c74403e453f8954b593db69d · openeuler / Kernel

02 4月, 2013 1 次提交

workqueue: update sysfs interface to reflect NUMA awareness and a kernel param... · d55262c4

由 Tejun Heo 提交于 4月 01, 2013

workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity

Unbound workqueues are now NUMA aware.  Let's add some control knobs
and update sysfs interface accordingly.

* Add kernel param workqueue.numa_disable which disables NUMA affinity
  globally.

* Replace sysfs file "pool_id" with "pool_ids" which contain
  node:pool_id pairs.  This change is userland-visible but "pool_id"
  hasn't seen a release yet, so this is okay.

* Add a new sysf files "numa" which can toggle NUMA affinity on
  individual workqueues.  This is implemented as attrs->no_numa whichn
  is special in that it isn't part of a pool's attributes.  It only
  affects how apply_workqueue_attrs() picks which pools to use.

After "pool_ids" change, first_pwq() doesn't have any user left.
Removed.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

d55262c4

14 3月, 2013 1 次提交

workqueue: inline trivial wrappers · 8425e3d5

由 Tejun Heo 提交于 3月 13, 2013

There's no reason to make these trivial wrappers full (exported)
functions.  Inline the followings.

 queue_work()
 queue_delayed_work()
 mod_delayed_work()
 schedule_work_on()
 schedule_work()
 schedule_delayed_work_on()
 schedule_delayed_work()
 keventd_up()
Signed-off-by: NTejun Heo <tj@kernel.org>

8425e3d5

13 3月, 2013 8 次提交

workqueue: implement current_is_workqueue_rescuer() · e6267616

由 Tejun Heo 提交于 3月 12, 2013

Implement a function which queries whether it currently is running off
a workqueue rescuer.  This will be used to convert writeback to
workqueue.
Signed-off-by: NTejun Heo <tj@kernel.org>

e6267616

workqueue: implement sysfs interface for workqueues · 226223ab

由 Tejun Heo 提交于 3月 12, 2013

There are cases where workqueue users want to expose control knobs to
userland.  e.g. Unbound workqueues with custom attributes are
scheduled to be used for writeback workers and depending on
configuration it can be useful to allow admins to tinker with the
priority or allowed CPUs.

This patch implements workqueue_sysfs_register(), which makes the
workqueue visible under /sys/bus/workqueue/devices/WQ_NAME.  There
currently are two attributes common to both per-cpu and unbound pools
and extra attributes for unbound pools including nice level and
cpumask.

If alloc_workqueue*() is called with WQ_SYSFS,
workqueue_sysfs_register() is called automatically as part of
workqueue creation.  This is the preferred method unless the workqueue
user wants to apply workqueue_attrs before making the workqueue
visible to userland.

v2: Disallow exposing ordered workqueues as ordered workqueues can't
    be tuned in any way.
Signed-off-by: NTejun Heo <tj@kernel.org>

226223ab

workqueue: reject adjusting max_active or applying attrs to ordered workqueues · 8719dcea

由 Tejun Heo 提交于 3月 12, 2013

Adjusting max_active of or applying new workqueue_attrs to an ordered
workqueue breaks its ordering guarantee.  The former is obvious.  The
latter is because applying attrs creates a new pwq (pool_workqueue)
and there is no ordering constraint between the old and new pwqs.

Make apply_workqueue_attrs() and workqueue_set_max_active() trigger
WARN_ON() if those operations are requested on an ordered workqueue
and fail / ignore respectively.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

8719dcea

workqueue: make it clear that WQ_DRAINING is an internal flag · 618b01eb

由 Tejun Heo 提交于 3月 12, 2013

We're gonna add another internal WQ flag.  Let's make the distinction
clear.  Prefix WQ_DRAINING with __ and move it to bit 16.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

618b01eb

workqueue: implement apply_workqueue_attrs() · 9e8cd2f5

由 Tejun Heo 提交于 3月 12, 2013

Implement apply_workqueue_attrs() which applies workqueue_attrs to the
specified unbound workqueue by creating a new pwq (pool_workqueue)
linked to worker_pool with the specified attributes.

A new pwq is linked at the head of wq->pwqs instead of tail and
__queue_work() verifies that the first unbound pwq has positive refcnt
before choosing it for the actual queueing.  This is to cover the case
where creation of a new pwq races with queueing.  As base ref on a pwq
won't be dropped without making another pwq the first one,
__queue_work() is guaranteed to make progress and not add work item to
a dead pwq.

init_and_link_pwq() is updated to return the last first pwq the new
pwq replaced, which is put by apply_workqueue_attrs().

Note that apply_workqueue_attrs() is almost identical to unbound pwq
part of alloc_and_link_pwqs().  The only difference is that there is
no previous first pwq.  apply_workqueue_attrs() is implemented to
handle such cases and replaces unbound pwq handling in
alloc_and_link_pwqs().
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

9e8cd2f5

workqueue: drop WQ_RESCUER and test workqueue->rescuer for NULL instead · 493008a8

由 Tejun Heo 提交于 3月 12, 2013

WQ_RESCUER is superflous.  WQ_MEM_RECLAIM indicates that the user
wants a rescuer and testing wq->rescuer for NULL can answer whether a
given workqueue has a rescuer or not.  Drop WQ_RESCUER and test
wq->rescuer directly.

This will help simplifying __alloc_workqueue_key() failure path by
allowing it to use destroy_workqueue() on a partially constructed
workqueue, which in turn will help implementing dynamic management of
pool_workqueues.

While at it, clear wq->rescuer after freeing it in
destroy_workqueue().  This is a precaution as scheduled changes will
make destruction more complex.

This patch doesn't introduce any functional changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

493008a8

workqueue: introduce workqueue_attrs · 7a4e344c

由 Tejun Heo 提交于 3月 12, 2013

Introduce struct workqueue_attrs which carries worker attributes -
currently the nice level and allowed cpumask along with helper
routines alloc_workqueue_attrs() and free_workqueue_attrs().

Each worker_pool now carries ->attrs describing the attributes of its
workers.  All functions dealing with cpumask and nice level of workers
are updated to follow worker_pool->attrs instead of determining them
from other characteristics of the worker_pool, and init_workqueues()
is updated to set worker_pool->attrs appropriately for all standard
pools.

Note that create_worker() is updated to always perform set_user_nice()
and use set_cpus_allowed_ptr() combined with manual assertion of
PF_THREAD_BOUND instead of kthread_bind().  This simplifies handling
random attributes without affecting the outcome.

This patch doesn't introduce any behavior changes.

v2: Missing cpumask_var_t definition caused build failure on some
    archs.  linux/cpumask.h included.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

7a4e344c

workqueue: consistently use int for @cpu variables · d84ff051

由 Tejun Heo 提交于 3月 12, 2013

Workqueue is mixing unsigned int and int for @cpu variables.  There's
no point in using unsigned int for cpus - many of cpu related APIs
take int anyway.  Consistently use int for @cpu variables so that we
can use negative values to mark special ones.

This patch doesn't introduce any visible behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

d84ff051

05 3月, 2013 1 次提交

workqueue: allow more off-queue flag space · 45d9550a

由 Lai Jiangshan 提交于 2月 19, 2013

When a work item is off-queue, its work->data contains WORK_STRUCT_*
and WORK_OFFQ_* flags.  As WORK_OFFQ_* flags are used only while a
work item is off-queue, it can occupy bits of work->data which aren't
used while off-queue.  WORK_OFFQ_* currently only use bits used by
on-queue CWQ pointer.  As color bits aren't used while off-queue,
there's no reason to not use them.

Lower WORK_OFFQ_FLAG_BASE from WORK_STRUCT_FLAG_BITS to
WORK_STRUCT_COLOR_SHIFT thus giving 4 more bits to off-queue flag
space which is also used to record worker_pool ID while off-queue.

This doesn't introduce any visible behavior difference.

tj: Rewrote the description.
Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

45d9550a

14 2月, 2013 1 次提交

workqueue: rename cpu_workqueue to pool_workqueue · 112202d9

由 Tejun Heo 提交于 2月 13, 2013

workqueue has moved away from global_cwqs to worker_pools and with the
scheduled custom worker pools, wforkqueues will be associated with
pools which don't have anything to do with CPUs.  The workqueue code
went through significant amount of changes recently and mass renaming
isn't likely to hurt much additionally.  Let's replace 'cpu' with
'pool' so that it reflects the current design.

* s/struct cpu_workqueue_struct/struct pool_workqueue/
* s/cpu_wq/pool_wq/
* s/cwq/pwq/

This patch is purely cosmetic.
Signed-off-by: NTejun Heo <tj@kernel.org>

112202d9

07 2月, 2013 2 次提交

workqueue: add delayed_work->wq to simplify reentrancy handling · 60c057bc

由 Lai Jiangshan 提交于 2月 06, 2013

To avoid executing the same work item from multiple CPUs concurrently,
a work_struct records the last pool it was on in its ->data so that,
on the next queueing, the pool can be queried to determine whether the
work item is still executing or not.

A delayed_work goes through timer before actually being queued on the
target workqueue and the timer needs to know the target workqueue and
CPU. This is currently achieved by modifying delayed_work->work.data
such that it points to the cwq which points to the target workqueue
and the last CPU the work item was on. __queue_delayed_work()
extracts the last CPU from delayed_work->work.data and then combines
it with the target workqueue to create new work.data.

The only thing this rather ugly hack achieves is encoding the target
workqueue into delayed_work->work.data without using a separate field,
which could be a trade off one can make; unfortunately, this entangles
work->data management between regular workqueue and delayed_work code
by setting cwq pointer before the work item is actually queued and
becomes a hindrance for further improvements of work->data handling.

This can be easily made sane by adding a target workqueue field to
delayed_work. While delayed_work is used widely in the kernel and
this does make it a bit larger (<5%), I think this is the right
trade-off especially given the prospect of much saner handling of
work->data which currently involves quite tricky memory barrier
dancing, and don't expect to see any measureable effect.

Add delayed_work->wq and drop the delayed_work->work.data overloading.

tj: Rewrote the description.
Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

60c057bc

workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END · 6be19588

由 Lai Jiangshan 提交于 2月 06, 2013

Now that workqueue has moved away from gcwqs, workqueue no longer has
the need to have a CPU identifier indicating "no cpu associated" - we
now use WORK_OFFQ_POOL_NONE instead - and most uses of WORK_CPU_NONE
are gone.

The only left usage is as the end marker for for_each_*wq*()
iterators, where the name WORK_CPU_NONE is confusing w/o actual
WORK_CPU_NONE usages.  Similarly, WORK_CPU_LAST which equals
WORK_CPU_NONE no longer makes sense.

Replace both WORK_CPU_NONE and LAST with WORK_CPU_END.  This patch
doesn't introduce any functional difference.

tj: s/WORK_CPU_LAST/WORK_CPU_END/ and rewrote the description.
Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

6be19588

25 1月, 2013 3 次提交

workqueue: record pool ID instead of CPU in work->data when off-queue · 7c3eed5c

由 Tejun Heo 提交于 1月 24, 2013

Currently, when a work item is off-queue, work->data records the CPU
it was last on, which is used to locate the last executing instance
for non-reentrance, flushing, etc.

We're in the process of removing global_cwq and making worker_pool the
top level abstraction.  This patch makes work->data point to the pool
it was last associated with instead of CPU.

After the previous WORK_OFFQ_POOL_CPU and worker_poo->id additions,
the conversion is fairly straight-forward.  WORK_OFFQ constants and
functions are modified to record and read back pool ID instead.
worker_pool_by_id() is added to allow looking up pool from ID.
get_work_pool() replaces get_work_gcwq(), which is reimplemented using
get_work_pool().  get_work_pool_id() replaces work_cpu().

This patch shouldn't introduce any observable behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

7c3eed5c

workqueue: introduce WORK_OFFQ_CPU_NONE · 715b06b8

由 Tejun Heo 提交于 1月 24, 2013

Currently, when a work item is off queue, high bits of its data
encodes the last CPU it was on.  This is scheduled to be changed to
pool ID, which will make it impossible to use WORK_CPU_NONE to
indicate no association.

This patch limits the number of bits which are used for off-queue cpu
number to 31 (so that the max fits in an int) and uses the highest
possible value - WORK_OFFQ_CPU_NONE - to indicate no association.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

715b06b8

workqueue: unexport work_cpu() · e2905b29

由 Tejun Heo 提交于 1月 24, 2013

This function no longer has any external users.  Unexport it.  It will
be removed later on.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>

e2905b29

22 8月, 2012 6 次提交

workqueue: deprecate __cancel_delayed_work() · 136b5721

由 Tejun Heo 提交于 8月 21, 2012

Now that cancel_delayed_work() can be safely called from IRQ handlers,
there's no reason to use __cancel_delayed_work().  Use
cancel_delayed_work() instead of __cancel_delayed_work() and mark the
latter deprecated.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NJens Axboe <axboe@kernel.dk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Roland Dreier <roland@kernel.org>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>

136b5721

workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() · 57b30ae7

由 Tejun Heo 提交于 8月 21, 2012

cancel_delayed_work() can't be called from IRQ handlers due to its use
of del_timer_sync() and can't cancel work items which are already
transferred from timer to worklist.

Also, unlike other flush and cancel functions, a canceled delayed_work
would still point to the last associated cpu_workqueue.  If the
workqueue is destroyed afterwards and the work item is re-used on a
different workqueue, the queueing code can oops trying to dereference
already freed cpu_workqueue.

This patch reimplements cancel_delayed_work() using
try_to_grab_pending() and set_work_cpu_and_clear_pending().  This
allows the function to be called from IRQ handlers and makes its
behavior consistent with other flush / cancel functions.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

57b30ae7

workqueue: use irqsafe timer for delayed_work · e0aecdd8

由 Tejun Heo 提交于 8月 21, 2012

Up to now, for delayed_works, try_to_grab_pending() couldn't be used
from IRQ handlers because IRQs may happen while
delayed_work_timer_fn() is in progress leading to indefinite -EAGAIN.

This patch makes delayed_work use the new TIMER_IRQSAFE flag for
delayed_work->timer.  This makes try_to_grab_pending() and thus
mod_delayed_work_on() safe to call from IRQ handlers.
Signed-off-by: NTejun Heo <tj@kernel.org>

e0aecdd8

workqueue: clean up delayed_work initializers and add missing one · f991b318

由 Tejun Heo 提交于 8月 21, 2012

Reimplement delayed_work initializers using new timer initializers
which take timer flags.  This reduces code duplications and will ease
further initializer changes.  This patch also adds a missing
initializer - INIT_DEFERRABLE_WORK_ONSTACK().
Signed-off-by: NTejun Heo <tj@kernel.org>

f991b318

workqueue: make deferrable delayed_work initializer names consistent · 203b42f7

由 Tejun Heo 提交于 8月 21, 2012

Initalizers for deferrable delayed_work are confused.

* __DEFERRED_WORK_INITIALIZER()
* DECLARE_DEFERRED_WORK()
* INIT_DELAYED_WORK_DEFERRABLE()

Rename them to

* __DEFERRABLE_WORK_INITIALIZER()
* DECLARE_DEFERRABLE_WORK()
* INIT_DEFERRABLE_WORK()

This patch doesn't cause any functional changes.
Signed-off-by: NTejun Heo <tj@kernel.org>

203b42f7

workqueue: cosmetic whitespace updates for macro definitions · ee64e7f6

由 Tejun Heo 提交于 8月 21, 2012

Consistently use the last tab position for '\' line continuation in
complex macro definitions.  This is to help the following patches.

This patch is cosmetic.
Signed-off-by: NTejun Heo <tj@kernel.org>

ee64e7f6

21 8月, 2012 4 次提交

workqueue: deprecate system_nrt[_freezable]_wq · 3b07e9ca

由 Tejun Heo 提交于 8月 20, 2012

system_nrt[_freezable]_wq are now spurious.  Mark them deprecated and
convert all users to system[_freezable]_wq.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
Please use system[_freezable]_wq instead.

This patch doesn't make any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-By: NLai Jiangshan <laijs@cn.fujitsu.com>

Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Airlie <airlied@linux.ie>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>

3b07e9ca

workqueue: deprecate flush[_delayed]_work_sync() · 43829731

由 Tejun Heo 提交于 8月 20, 2012

flush[_delayed]_work_sync() are now spurious.  Mark them deprecated
and convert all users to flush[_delayed]_work().

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant and the regular flushes guarantee that the work item is
not pending or running on any CPU on return, so there's no reason to
use the sync flushes at all and they're going away.

This patch doesn't make any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mattia Dongili <malattia@linux.it>
Cc: Kent Yoder <key@linux.vnet.ibm.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Bryan Wu <bryan.wu@canonical.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: Anton Vorontsov <cbou@mail.ru>
Cc: Sangbeom Kim <sbkim73@samsung.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Petr Vandrovec <petr@vandrovec.name>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Avi Kivity <avi@redhat.com>

43829731

workqueue: gut system_nrt[_freezable]_wq() · ae930e0f

由 Tejun Heo 提交于 8月 20, 2012

Now that all workqueues are non-reentrant, system[_freezable]_wq() are
equivalent to system_nrt[_freezable]_wq().  Replace the latter with
wrappers around system[_freezable]_wq().  The wrapping goes through
inline functions so that __deprecated can be added easily.
Signed-off-by: NTejun Heo <tj@kernel.org>

ae930e0f

workqueue: gut flush[_delayed]_work_sync() · 606a5020

由 Tejun Heo 提交于 8月 20, 2012

Now that all workqueues are non-reentrant, flush[_delayed]_work_sync()
are equivalent to flush[_delayed]_work().  Drop the separate
implementation and make them thin wrappers around
flush[_delayed]_work().

* start_flush_work() no longer takes @wait_executing as the only left
  user - flush_work() - always sets it to %true.

* __cancel_work_timer() uses flush_work() instead of wait_on_work().
Signed-off-by: NTejun Heo <tj@kernel.org>

606a5020

14 8月, 2012 1 次提交

workqueue: fix CPU binding of flush_delayed_work[_sync]() · 1265057f

由 Tejun Heo 提交于 8月 08, 2012

delayed_work encodes the workqueue to use and the last CPU in
delayed_work->work.data while it's on timer.  The target CPU is
implicitly recorded as the CPU the timer is queued on and
delayed_work_timer_fn() queues delayed_work->work to the CPU it is
running on.

Unfortunately, this leaves flush_delayed_work[_sync]() no way to find
out which CPU the delayed_work was queued for when they try to
re-queue after killing the timer.  Currently, it chooses the local CPU
flush is running on.  This can unexpectedly move a delayed_work queued
on a specific CPU to another CPU and lead to subtle errors.

There isn't much point in trying to save several bytes in struct
delayed_work, which is already close to a hundred bytes on 64bit with
all debug options turned off.  This patch adds delayed_work->cpu to
remember the CPU it's queued for.

Note that if the timer is migrated during CPU down, the work item
could be queued to the downed global_cwq after this change.  As a
detached global_cwq behaves like an unbound one, this doesn't change
much for the delayed_work.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

1265057f

04 8月, 2012 6 次提交

workqueue: implement mod_delayed_work[_on]() · 8376fe22

由 Tejun Heo 提交于 8月 03, 2012

Workqueue was lacking a mechanism to modify the timeout of an already
pending delayed_work.  delayed_work users have been working around
this using several methods - using an explicit timer + work item,
messing directly with delayed_work->timer, and canceling before
re-queueing, all of which are error-prone and/or ugly.

This patch implements mod_delayed_work[_on]() which behaves similarly
to mod_timer() - if the delayed_work is idle, it's queued with the
given delay; otherwise, its timeout is modified to the new value.
Zero @delay guarantees immediate execution.

v2: Updated to reflect try_to_grab_pending() changes.  Now safe to be
    called from bh context.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>

8376fe22

workqueue: mark a work item being canceled as such · bbb68dfa

由 Tejun Heo 提交于 8月 03, 2012

There can be two reasons try_to_grab_pending() can fail with -EAGAIN.
One is when someone else is queueing or deqeueing the work item.  With
the previous patches, it is guaranteed that PENDING and queued state
will soon agree making it safe to busy-retry in this case.

The other is if multiple __cancel_work_timer() invocations are racing
one another.  __cancel_work_timer() grabs PENDING and then waits for
running instances of the target work item on all CPUs while holding
PENDING and !queued.  try_to_grab_pending() invoked from another task
will keep returning -EAGAIN while the current owner is waiting.

Not distinguishing the two cases is okay because __cancel_work_timer()
is the only user of try_to_grab_pending() and it invokes
wait_on_work() whenever grabbing fails.  For the first case, busy
looping should be fine but wait_on_work() doesn't cause any critical
problem.  For the latter case, the new contender usually waits for the
same condition as the current owner, so no unnecessarily extended
busy-looping happens.  Combined, these make __cancel_work_timer()
technically correct even without irq protection while grabbing PENDING
or distinguishing the two different cases.

While the current code is technically correct, not distinguishing the
two cases makes it difficult to use try_to_grab_pending() for other
purposes than canceling because it's impossible to tell whether it's
safe to busy-retry grabbing.

This patch adds a mechanism to mark a work item being canceled.
try_to_grab_pending() now disables irq on success and returns -EAGAIN
to indicate that grabbing failed but PENDING and queued states are
gonna agree soon and it's safe to busy-loop.  It returns -ENOENT if
the work item is being canceled and it may stay PENDING && !queued for
arbitrary amount of time.

__cancel_work_timer() is modified to mark the work canceling with
WORK_OFFQ_CANCELING after grabbing PENDING, thus making
try_to_grab_pending() fail with -ENOENT instead of -EAGAIN.  Also, it
invokes wait_on_work() iff grabbing failed with -ENOENT.  This isn't
necessary for correctness but makes it consistent with other future
users of try_to_grab_pending().

v2: try_to_grab_pending() was testing preempt_count() to ensure that
    the caller has disabled preemption.  This triggers spuriously if
    !CONFIG_PREEMPT_COUNT.  Use preemptible() instead.  Reported by
    Fengguang Wu.

v3: Updated so that try_to_grab_pending() disables irq on success
    rather than requiring preemption disabled by the caller.  This
    makes busy-looping easier and will allow try_to_grap_pending() to
    be used from bh/irq contexts.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>

bbb68dfa

workqueue: introduce WORK_OFFQ_FLAG_* · b5490077

由 Tejun Heo 提交于 8月 03, 2012

Low WORK_STRUCT_FLAG_BITS bits of work_struct->data contain
WORK_STRUCT_FLAG_* and flush color.  If the work item is queued, the
rest point to the cpu_workqueue with WORK_STRUCT_CWQ set; otherwise,
WORK_STRUCT_CWQ is clear and the bits contain the last CPU number -
either a real CPU number or one of WORK_CPU_*.

Scheduled addition of mod_delayed_work[_on]() requires an additional
flag, which is used only while a work item is off queue.  There are
more than enough bits to represent off-queue CPU number on both 32 and
64bits.  This patch introduces WORK_OFFQ_FLAG_* which occupy the lower
part of the @work->data high bits while off queue.  This patch doesn't
define any actual OFFQ flag yet.

Off-queue CPU number is now shifted by WORK_OFFQ_CPU_SHIFT, which adds
the number of bits used by OFFQ flags to WORK_STRUCT_FLAG_SHIFT, to
make room for OFFQ flags.

To avoid shift width warning with large WORK_OFFQ_FLAG_BITS, ulong
cast is added to WORK_STRUCT_NO_CPU and, just in case, BUILD_BUG_ON()
to check that there are enough bits to accomodate off-queue CPU number
is added.

This patch doesn't make any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>

b5490077

workqueue: set delayed_work->timer function on initialization · d8e794df

由 Tejun Heo 提交于 8月 03, 2012

delayed_work->timer.function is currently initialized during
queue_delayed_work_on().  Export delayed_work_timer_fn() and set
delayed_work timer function during delayed_work initialization
together with other fields.

This ensures the timer function is always valid on an initialized
delayed_work.  This is to help mod_delayed_work() implementation.

To detect delayed_work users which diddle with the internal timer,
trigger WARN if timer function doesn't match on queue.
Signed-off-by: NTejun Heo <tj@kernel.org>

d8e794df

workqueue: make queueing functions return bool · d4283e93

由 Tejun Heo 提交于 8月 03, 2012

All queueing functions return 1 on success, 0 if the work item was
already pending.  Update them to return bool instead.  This signifies
better that they don't return 0 / -errno.

This is cleanup and doesn't cause any functional difference.

While at it, fix comment opening for schedule_work_on().
Signed-off-by: NTejun Heo <tj@kernel.org>

d4283e93

workqueue: reorder queueing functions so that _on() variants are on top · 0a13c00e

由 Tejun Heo 提交于 8月 03, 2012

Currently, queue/schedule[_delayed]_work_on() are located below the
counterpart without the _on postifx even though the latter is usually
implemented using the former.  Swap them.

This is cleanup and doesn't cause any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>

0a13c00e

02 3月, 2012 1 次提交

Block: use a freezable workqueue for disk-event polling · 62d3c543

由 Alan Stern 提交于 3月 02, 2012

This patch (as1519) fixes a bug in the block layer's disk-events
polling.  The polling is done by a work routine queued on the
system_nrt_wq workqueue.  Since that workqueue isn't freezable, the
polling continues even in the middle of a system sleep transition.

Obviously, polling a suspended drive for media changes and such isn't
a good thing to do; in the case of USB mass-storage devices it can
lead to real problems requiring device resets and even re-enumeration.

The patch fixes things by creating a new system-wide, non-reentrant,
freezable workqueue and using it for disk-events polling.
Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
CC: <stable@kernel.org>
Acked-by: NTejun Heo <tj@kernel.org>
Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

62d3c543

11 1月, 2012 1 次提交

workqueue: make alloc_workqueue() take printf fmt and args for name · b196be89

由 Tejun Heo 提交于 1月 10, 2012

alloc_workqueue() currently expects the passed in @name pointer to remain
accessible.  This is inconvenient and a bit silly given that the whole wq
is being dynamically allocated.  This patch updates alloc_workqueue() and
friends to take printf format string instead of opaque string and matching
varargs at the end.  The name is allocated together with the wq and
formatted.

alloc_ordered_workqueue() is converted to a macro to unify varargs
handling with alloc_workqueue(), and, while at it, add comment to
alloc_workqueue().

None of the current in-kernel users pass in string with '%' as constant
name and this change shouldn't cause any problem.

[akpm@linux-foundation.org: use __printf]
Signed-off-by: NTejun Heo <tj@kernel.org>
Suggested-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b196be89

27 7月, 2011 1 次提交

atomic: use <linux/atomic.h> · 60063497

由 Arun Sharma 提交于 7月 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: NArun Sharma <asharma@fb.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60063497

20 5月, 2011 2 次提交

workqueue: separate out drain_workqueue() from destroy_workqueue() · 9c5a2ba7

由 Tejun Heo 提交于 4月 05, 2011

There are users which want to drain workqueues without destroying it.
Separate out drain functionality from destroy_workqueue() into
drain_workqueue() and make it accessible to workqueue users.

To guarantee forward-progress, only chain queueing is allowed while
drain is in progress.  If a new work item which isn't chained from the
running or pending work items is queued while draining is in progress,
WARN_ON_ONCE() is triggered.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>

9c5a2ba7

workqueue: remove cancel_rearming_delayed_work[queue]() · 2543a871

由 Amerigo Wang 提交于 4月 06, 2011

cancel_rearming_delayed_work() and cancel_rearming_delayed_workqueue()
can be removed now.
Signed-off-by: NWANG Cong <amwang@redhat.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

2543a871

21 2月, 2011 1 次提交

workqueue: fix build failure introduced by s/freezeable/freezable/ · 24d51add

由 Tejun Heo 提交于 2月 21, 2011

wq:fixes-2.6.38 does s/WQ_FREEZEABLE/WQ_FREEZABLE and wq:for-2.6.39
adds new usage of the flag.  The combination of the two creates a
build failure after merge.  Fix it by renaming all freezeables to
freezables.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>

24d51add

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功