1. May 18, 2018 (4 commits)
    • workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status} · 6b59808b
      Committed by Tejun Heo
      There can be a lot of workqueue workers, and they all show up with
      cryptic kworker/* names, making it difficult to understand which is
      doing what and how they came to be.
      
        # ps -ef | grep kworker
        root           4       2  0 Feb25 ?        00:00:00 [kworker/0:0H]
        root           6       2  0 Feb25 ?        00:00:00 [kworker/u112:0]
        root          19       2  0 Feb25 ?        00:00:00 [kworker/1:0H]
        root          25       2  0 Feb25 ?        00:00:00 [kworker/2:0H]
        root          31       2  0 Feb25 ?        00:00:00 [kworker/3:0H]
        ...
      
      This patch makes workqueue workers report the latest workqueue they
      were executing for through /proc/PID/{comm,stat,status}.  The extra
      information is appended to the kthread name, with an intervening '+'
      if currently executing, '-' otherwise.
      
        # cat /proc/25/comm
        kworker/2:0-events_power_efficient
        # cat /proc/25/stat
        25 (kworker/2:0-events_power_efficient) I 2 0 0 0 -1 69238880 0 0...
        # grep Name /proc/25/status
        Name:   kworker/2:0-events_power_efficient
      
      Unfortunately, ps(1) truncates comm to 15 characters,
      
        # ps 25
          PID TTY      STAT   TIME COMMAND
           25 ?        I      0:00 [kworker/2:0-eve]
      
      making it a lot less useful; however, this should be an easy fix from
      ps(1) side.
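      For reference, the naming rule above is simple enough to reproduce.
      Below is a minimal userspace C sketch of it; the function and
      parameter names are illustrative, not the kernel's.
      
        #include <stdbool.h>
        #include <stdio.h>
        
        /* '+' joins the name of the workqueue currently being executed,
         * '-' the one executed most recently. */
        static void format_worker_comm(char *buf, size_t size,
                                       const char *kthread_name,
                                       const char *wq_name, bool executing)
        {
                snprintf(buf, size, "%s%c%s", kthread_name,
                         executing ? '+' : '-', wq_name);
        }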
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Craig Small <csmall@enc.com.au>
    • workqueue: Set worker->desc to workqueue name by default · 8bf89593
      Committed by Tejun Heo
      Work functions can use set_worker_desc() to improve the visibility of
      what the worker task is doing.  Currently, the desc field is cleared
      at the beginning of each execution, and a separate field tracks
      whether desc was set during the current execution.
      
      Instead of being left empty until desc is set, worker->desc can by
      default remember the last workqueue the worker worked on, and users
      that call set_worker_desc() can override it with something more
      informative as necessary.
      
      This simplifies desc handling and helps track the last workqueue
      that the worker executed on, improving visibility.
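      As a concrete illustration, here is a hedged sketch of an override
      from a work function; the my_dev type and names are hypothetical,
      while set_worker_desc() is the existing interface.
      
        #include <linux/kernel.h>
        #include <linux/workqueue.h>
        
        struct my_dev {                 /* hypothetical driver type */
                struct work_struct work;
                char name[16];
        };
        
        static void my_dev_work_fn(struct work_struct *work)
        {
                struct my_dev *dev = container_of(work, struct my_dev, work);
        
                /* override the default desc (the workqueue's name, after
                 * this patch) with something more specific */
                set_worker_desc("my_dev-%s", dev->name);
                /* ... actual work ... */
        }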
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Make worker_attach/detach_pool() update worker->pool · a2d812a2
      Committed by Tejun Heo
      For historical reasons, the worker attach/detach functions don't
      currently manage worker->pool, and the callers update it manually
      and inconsistently.
      
      This patch moves worker->pool updates into the worker attach/detach
      functions.  This makes worker->pool consistent and clearly defines how
      worker->pool updates are synchronized.
      
      This will help later workqueue visibility improvements by allowing
      safe access to workqueue information from worker->task.
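      A simplified sketch of the resulting pattern (not the exact kernel
      code; the types are workqueue-internal, and the global attach mutex
      comes from the companion commit below):
      
        static void worker_attach_to_pool(struct worker *worker,
                                          struct worker_pool *pool)
        {
                mutex_lock(&wq_pool_attach_mutex);
                /* ... cpumask and flag setup elided ... */
                list_add_tail(&worker->node, &pool->workers);
                worker->pool = pool;    /* owned by attach/detach now */
                mutex_unlock(&wq_pool_attach_mutex);
        }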
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Replace pool->attach_mutex with global wq_pool_attach_mutex · 1258fae7
      Committed by Tejun Heo
      To improve workqueue visibility, we want to be able to access
      workqueue information from worker tasks.  The per-pool attach mutex
      makes that difficult because there's no way of stabilizing task ->
      worker pool association without knowing the pool first.
      
      Worker attach/detach is a slow path and there's no need for different
      pools to be able to perform them concurrently.  This patch replaces
      the per-pool attach_mutex with global wq_pool_attach_mutex to prepare
      for visibility improvement changes.
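      In sketch form, the lookup this enables (simplified; it assumes the
      task is already known to be a kworker):
      
        mutex_lock(&wq_pool_attach_mutex);
        worker = kthread_data(task);
        if (worker && worker->pool) {
                /* worker->pool cannot change while the mutex is held,
                 * so pool-derived info can be read safely here */
        }
        mutex_unlock(&wq_pool_attach_mutex);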
      Signed-off-by: Tejun Heo <tj@kernel.org>
  2. Mar 21, 2018 (2 commits)
  3. Mar 20, 2018 (1 commit)
    • RCU, workqueue: Implement rcu_work · 05f0fe6b
      Committed by Tejun Heo
      There are cases where an RCU callback needs to be bounced to a
      sleepable context.  This is currently done by the RCU callback
      queueing a work item, which can be cumbersome to write and confusing
      to read.
      
      This patch introduces rcu_work, a workqueue work variant which gets
      executed after an RCU grace period, and converts the open-coded
      bouncing in fs/aio and kernel/cgroup.
      
      v3: Dropped queue_rcu_work_on().  Documented RCU grace period
          behavior after queue_rcu_work().
      
      v2: Use rcu_barrier() instead of synchronize_rcu() to wait for
          completion of previously queued rcu callback as per Paul.
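      Usage, per the interface this patch adds (a hedged sketch; the
      my_obj type and names are hypothetical):
      
        #include <linux/slab.h>
        #include <linux/workqueue.h>
        
        struct my_obj {
                struct rcu_work rwork;
                /* ... payload ... */
        };
        
        static void my_obj_free_fn(struct work_struct *work)
        {
                struct my_obj *obj = container_of(to_rcu_work(work),
                                                  struct my_obj, rwork);
        
                kfree(obj);     /* sleepable, after a grace period */
        }
        
        static void my_obj_release(struct my_obj *obj)
        {
                /* instead of call_rcu() + a hand-rolled work bounce */
                INIT_RCU_WORK(&obj->rwork, my_obj_free_fn);
                queue_rcu_work(system_wq, &obj->rwork);
        }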
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
  4. Mar 14, 2018 (2 commits)
  5. Feb 21, 2018 (1 commit)
  6. Feb 17, 2018 (1 commit)
  7. Jan 15, 2018 (1 commit)
    • staging: lustre: lnet: convert selftest to use workqueues · 6106c0f8
      Committed by NeilBrown
      Instead of the cfs workitem library, use workqueues.
      
      As lnet wants to provide a CPU mask of allowed CPUs, it needs to be
      a WQ_UNBOUND workqueue so that tasks can run on CPUs other than the
      one where they were submitted.
      
      This patch also exports apply_workqueue_attrs(), a documented part
      of the workqueue API that isn't currently exported.  Lustre needs it
      to allow workqueue threads to be limited to a subset of CPUs.
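      A sketch of the conversion pattern (error handling elided; the
      workqueue name and parameters are illustrative, and
      alloc_workqueue_attrs() is shown with this era's signature):
      
        static struct workqueue_struct *
        lst_create_wq(const struct cpumask *allowed_mask, int nthreads)
        {
                struct workqueue_struct *wq;
                struct workqueue_attrs *attrs;
        
                wq = alloc_workqueue("lst_sched", WQ_UNBOUND, nthreads);
                attrs = alloc_workqueue_attrs(GFP_KERNEL);
                cpumask_copy(attrs->cpumask, allowed_mask); /* CPU subset */
                apply_workqueue_attrs(wq, attrs);   /* needs the new export */
                free_workqueue_attrs(attrs);
                return wq;
        }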
      
      Acked-by: Tejun Heo <tj@kernel.org> (for export of apply_workqueue_attrs)
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. Jan 13, 2018 (1 commit)
  9. Jan 8, 2018 (2 commits)
    • workqueue: allow WQ_MEM_RECLAIM on early init workqueues · 40c17f75
      Committed by Tejun Heo
      Workqueues can be created early during boot, before the workqueue
      subsystem is fully online; work items queued then are held until
      full initialization completes later.  However, early init wasn't
      supported for WQ_MEM_RECLAIM workqueues, causing unnecessary
      annoyances for a subset of users.  Expand early init support to
      include WQ_MEM_RECLAIM workqueues.
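      A sketch of what workqueue_init() can now do once kthreads are
      available (simplified, modeled on the description and on the
      init_rescuer() split in the next commit):
      
        list_for_each_entry(wq, &workqueues, list) {
                /* attach rescuers to early WQ_MEM_RECLAIM workqueues */
                WARN(init_rescuer(wq),
                     "failed to create rescuer for wq \"%s\"\n", wq->name);
        }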
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
    • workqueue: separate out init_rescuer() · 983c7515
      Committed by Tejun Heo
      Separate out init_rescuer() from __alloc_workqueue_key() to prepare
      for early init support for WQ_MEM_RECLAIM.  This patch doesn't
      introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
  10. Dec 11, 2017 (1 commit)
  11. Dec 5, 2017 (3 commits)
  12. Nov 28, 2017 (1 commit)
  13. Nov 22, 2017 (1 commit)
    • treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts · 841b86f3
      Committed by Kees Cook
      With all callbacks converted, and the timer callback prototype
      switched over, the TIMER_FUNC_TYPE cast is no longer needed,
      so remove it. Conversion was done with the following scripts:
      
          perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
              $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)
      
          perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
              $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)
      
      The now unused macros are also dropped from include/linux/timer.h.
      Signed-off-by: Kees Cook <keescook@chromium.org>
  14. Nov 8, 2017 (1 commit)
  15. Nov 6, 2017 (1 commit)
  16. Nov 3, 2017 (1 commit)
  17. Oct 25, 2017 (2 commits)
    • workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes · fd1a5b04
      Committed by Byungchul Park
      The workqueue code added manual lock acquisition annotations to catch
      deadlocks.
      
      After lockdep crossrelease was introduced, some of those became
      redundant, since wait_for_completion() already does the acquisition
      and tracking.
      
      Remove the duplicate annotations.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: amir73il@gmail.com
      Cc: axboe@kernel.dk
      Cc: darrick.wong@oracle.com
      Cc: david@fromorbit.com
      Cc: hch@infradead.org
      Cc: idryomov@gmail.com
      Cc: johan@kernel.org
      Cc: johannes.berg@intel.com
      Cc: kernel-team@lge.com
      Cc: linux-block@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-xfs@vger.kernel.org
      Cc: oleg@redhat.com
      Cc: tj@kernel.org
      Link: http://lkml.kernel.org/r/1508921765-15396-9-git-send-email-byungchul.park@lge.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/atomics, workqueue: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE() · c95491ed
      Committed by Mark Rutland
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't currently harmful.
      
      However, for some features it is necessary to instrument reads and
      writes separately, which is not possible with ACCESS_ONCE(). This
      distinction is critical to correct operation.
      
      It's possible to transform the bulk of kernel code using the Coccinelle
      script below. However, this doesn't handle comments, leaving references
      to ACCESS_ONCE() instances which have been removed. As a preparatory
      step, this patch converts the workqueue code and comments to use
      {READ,WRITE}_ONCE() consistently.
      
      ----
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-12-git-send-email-paulmck@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  18. Oct 22, 2017 (1 commit)
  19. Oct 18, 2017 (1 commit)
  20. Oct 10, 2017 (1 commit)
    • workqueue: replace pool->manager_arb mutex with a flag · 692b4825
      Committed by Tejun Heo
      Josef reported a HARDIRQ-safe -> HARDIRQ-unsafe lock order detected by
      lockdep:
      
       [ 1270.472259] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
       [ 1270.472783] 4.14.0-rc1-xfstests-12888-g76833e8 #110 Not tainted
       [ 1270.473240] -----------------------------------------------------
       [ 1270.473710] kworker/u5:2/5157 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
       [ 1270.474239]  (&(&lock->wait_lock)->rlock){+.+.}, at: [<ffffffff8da253d2>] __mutex_unlock_slowpath+0xa2/0x280
       [ 1270.474994]
       [ 1270.474994] and this task is already holding:
       [ 1270.475440]  (&pool->lock/1){-.-.}, at: [<ffffffff8d2992f6>] worker_thread+0x366/0x3c0
       [ 1270.476046] which would create a new lock dependency:
       [ 1270.476436]  (&pool->lock/1){-.-.} -> (&(&lock->wait_lock)->rlock){+.+.}
       [ 1270.476949]
       [ 1270.476949] but this new dependency connects a HARDIRQ-irq-safe lock:
       [ 1270.477553]  (&pool->lock/1){-.-.}
       ...
       [ 1270.488900] to a HARDIRQ-irq-unsafe lock:
       [ 1270.489327]  (&(&lock->wait_lock)->rlock){+.+.}
       ...
       [ 1270.494735]  Possible interrupt unsafe locking scenario:
       [ 1270.494735]
       [ 1270.495250]        CPU0                    CPU1
       [ 1270.495600]        ----                    ----
       [ 1270.495947]   lock(&(&lock->wait_lock)->rlock);
       [ 1270.496295]                                local_irq_disable();
       [ 1270.496753]                                lock(&pool->lock/1);
       [ 1270.497205]                                lock(&(&lock->wait_lock)->rlock);
       [ 1270.497744]   <Interrupt>
       [ 1270.497948]     lock(&pool->lock/1);
      
      which will cause an IRQ inversion deadlock if the above lock
      scenario happens.
      
      The root cause of this safe -> unsafe lock order is the
      mutex_unlock(pool->manager_arb) in manage_workers() with pool->lock
      held.
      
      Unlocking a mutex while holding an irq spinlock was never safe, and
      this problem has been around forever, but it never got noticed
      because the mutex is usually only trylocked while holding the irq
      lock, making actual failures very unlikely, and the lockdep
      annotation missed the condition until the recent b9c16a0e
      ("locking/mutex: Fix lockdep_assert_held() fail").
      
      Using a mutex for pool->manager_arb has always been a bit of a
      stretch.  It is primarily a mechanism to arbitrate managership
      between workers, which can easily be done with a pool flag.  The
      only reason it became a mutex is that the pool destruction path
      wants to exclude parallel managing operations.
      
      This patch replaces the mutex with a new pool flag,
      POOL_MANAGER_ACTIVE, and makes the destruction path wait for the
      current manager on a wait queue.
      
      v2: Drop unnecessary flag clearing before pool destruction as
          suggested by Boqun.
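      A simplified sketch of the flag-based arbitration (modeled on the
      description; wq_manager_wait names the wait queue the destruction
      path sleeps on):
      
        static bool manage_workers(struct worker *worker)
        {
                struct worker_pool *pool = worker->pool;
        
                if (pool->flags & POOL_MANAGER_ACTIVE)
                        return false;   /* someone else is managing */
        
                pool->flags |= POOL_MANAGER_ACTIVE;
                maybe_create_worker(pool);
                pool->flags &= ~POOL_MANAGER_ACTIVE;
                wake_up(&wq_manager_wait);  /* destruction may wait here */
                return true;
        }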
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: stable@vger.kernel.org
  21. Oct 5, 2017 (2 commits)
    • workqueue: Convert callback to use from_timer() · 8c20feb6
      Committed by Kees Cook
      In preparation for unconditionally passing the struct timer_list pointer
      to all timer callbacks, switch workqueue to use from_timer() and pass the
      timer pointer explicitly.
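      The shape of the conversion, using workqueue's mayday timer as the
      example (simplified sketch):
      
        /* before: an opaque unsigned long cast back to the pool */
        static void pool_mayday_timeout(unsigned long __pool)
        {
                struct worker_pool *pool = (void *)__pool;
                /* ... */
        }
        
        /* after: the timer pointer is passed in and the container
         * recovered with from_timer() */
        static void pool_mayday_timeout(struct timer_list *t)
        {
                struct worker_pool *pool = from_timer(pool, t, mayday_timer);
                /* ... */
        }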
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux1394-devel@lists.sourceforge.net
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Harish Patil <harish.patil@cavium.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Manish Chopra <manish.chopra@cavium.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-pm@vger.kernel.org
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: linux-watchdog@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Michael Reed <mdr@sgi.com>
      Cc: netdev@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lkml.kernel.org/r/1507159627-127660-14-git-send-email-keescook@chromium.org
    • timer: Remove users of TIMER_DEFERRED_INITIALIZER · 5cd79d6a
      Committed by Kees Cook
      This removes uses of TIMER_DEFERRED_INITIALIZER and chooses a location
      to call timer_setup() from before add_timer() or mod_timer() is called.
      Adjusts callbacks to use from_timer() as needed.
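      A sketch of the replacement pattern (names taken from the workqueue
      pool timers; simplified):
      
        /* was a static TIMER_DEFERRED_INITIALIZER at definition time;
         * now set up at runtime before the timer is first armed */
        timer_setup(&pool->idle_timer, idle_worker_timeout,
                    TIMER_DEFERRABLE);
        mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);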
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux1394-devel@lists.sourceforge.net
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Harish Patil <harish.patil@cavium.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Manish Chopra <manish.chopra@cavium.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-pm@vger.kernel.org
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: linux-watchdog@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Michael Reed <mdr@sgi.com>
      Cc: netdev@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lkml.kernel.org/r/1507159627-127660-7-git-send-email-keescook@chromium.org
  22. Aug 29, 2017 (1 commit)
  23. Aug 25, 2017 (2 commits)
    • locking/lockdep: Fix workqueue crossrelease annotation · e6f3faa7
      Committed by Peter Zijlstra
      The new completion/crossrelease annotations interact unfavourably
      with the extant flush_work()/flush_workqueue() annotations.
      
      The problem is that when a single work class does:
      
        wait_for_completion(&C)
      
      and
      
        complete(&C)
      
      in different executions, we'll build dependencies like:
      
        lock_map_acquire(W)
        complete_acquire(C)
      
      and
      
        lock_map_acquire(W)
        complete_release(C)
      
      which results in the dependency chain W->C->W, which lockdep thinks
      spells deadlock, even though there is no deadlock potential since
      works are run concurrently.
      
      One possibility would be to change the work 'lock' to
      recursive-read, but that would mean hitting a lockdep limitation on
      recursive locks.  Also, unconditionally switching to recursive-read
      here would fail to detect the actual deadlock on single-threaded
      workqueues, which do have a problem with this.
      
      For now, forcefully disregard these locks for crossrelease.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • workqueue/lockdep: 'Fix' flush_work() annotation · a1d14934
      Committed by Peter Zijlstra
      The flush_work() annotation as introduced by commit:
      
        e159489b ("workqueue: relax lockdep annotation on flush_work()")
      
      hits on the lockdep problem with recursive read locks.
      
      The situation as described is:
      
      Work W1:                Work W2:        Task:
      
      ARR(Q)                  ARR(Q)          flush_workqueue(Q)
      A(W1)                   A(W2)           A(Q)
        flush_work(W2)                        R(Q)
          A(W2)
          R(W2)
          if (special)
            A(Q)
          else
            ARR(Q)
          R(Q)
      
      where: A - acquire, ARR - acquire-read-recursive, R - release.
      
      Under 'special' conditions we want to trigger a lock-recursion
      deadlock, but otherwise allow the flush_work().  The allowing is
      done with recursive read locks (ARR), but lockdep is broken for
      recursive locks.
      
      However, there appears to be no need to acquire the lock if we're not
      'special', so if we remove the 'else' clause things become much
      simpler and no longer need the recursion thing at all.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  24. Aug 23, 2017 (1 commit)
  25. Aug 17, 2017 (1 commit)
    • locking/lockdep: Explicitly initialize wq_barrier::done::map · 52fa5bc5
      Committed by Boqun Feng
      With the new lockdep crossrelease feature, which checks completions usage,
      a false positive is reported in the workqueue code:
      
      > Worker A : acquired of wfc.work -> wait for cpu_hotplug_lock to be released
      > Task   B : acquired of cpu_hotplug_lock -> wait for lock#3 to be released
      > Task   C : acquired of lock#3 -> wait for completion of barr->done
      > (Task C is in lru_add_drain_all_cpuslocked())
      > Worker D : wait for wfc.work to be released -> will complete barr->done
      
      Such a deadlock cannot happen because Task C's barr->done and Worker
      D's barr->done cannot be the same instance.
      
      The reason for this false positive is that we initialize all
      wq_barrier::done instances at insert_wq_barrier() via
      init_completion(), which makes them all belong to the same lock
      class; as a result, impossible circles are reported.
      
      To fix this, explicitly initialize the lockdep map for
      wq_barrier::done in insert_wq_barrier(), so that the lock class key
      of wq_barrier::done is a subkey of the corresponding work_struct.
      As a result we won't build a dependency between a wq_barrier and an
      unrelated work, and we can differentiate wq barriers based on their
      related works, avoiding the false positive above.
      
      Also define an empty lockdep_init_map_crosslock() for !CROSSRELEASE
      to keep the code simple and free of unnecessary #ifdefs.
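      A sketch of the resulting initialization in insert_wq_barrier()
      (modeled on the description above, not a verbatim diff):
      
        lockdep_init_map_crosslock((struct lockdep_map *)&barr->done.map,
                                   "(complete)wq_barr::done",
                                   target->lockdep_map.key, 1);
        __init_completion(&barr->done);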
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170817094622.12915-1-boqun.feng@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  26. Aug 10, 2017 (1 commit)
    • locking/lockdep: Implement the 'crossrelease' feature · b09be676
      Committed by Byungchul Park
      Lockdep is a runtime locking correctness validator that detects and
      reports a deadlock or its possibility by checking dependencies between
      locks. It's useful since it does not report just an actual deadlock but
      also the possibility of a deadlock that has not actually happened yet.
      That enables problems to be fixed before they affect real systems.
      
      However, this facility is only applicable to typical locks, such as
      spinlocks and mutexes, which are normally released within the
      context in which they were acquired.  Synchronization primitives
      like page locks or completions, which are allowed to be released in
      any context, also create dependencies and can cause a deadlock.
      
      So lockdep should track these locks to do a better job. The 'crossrelease'
      implementation makes these primitives also be tracked.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: boqun.feng@gmail.com
      Cc: kernel-team@lge.com
      Cc: kirill@shutemov.name
      Cc: npiggin@gmail.com
      Cc: walken@google.com
      Cc: willy@infradead.org
      Link: http://lkml.kernel.org/r/1502089981-21272-6-git-send-email-byungchul.park@lge.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. Aug 7, 2017 (1 commit)
  28. Jul 28, 2017 (1 commit)
    • workqueue: Work around edge cases for calc of pool's cpumask · 1ad0f0a7
      Committed by Michael Bringmann
      There is an underlying assumption/trade-off in many layers of the Linux
      system that CPU <-> node mapping is static.  This is despite the presence
      of features like NUMA and 'hotplug' that support the dynamic addition/
      removal of fundamental system resources like CPUs and memory.  PowerPC
      systems, however, do provide extensive features for the dynamic change
      of resources available to a system.
      
      Currently, there is little or no synchronization protection around the
      updating of the CPU <-> node mapping, and the export/update of this
      information for other layers / modules.  In systems which can change
      this mapping during 'hotplug', like PowerPC, the information is changing
      underneath all layers that might reference it.
      
      This patch attempts to ensure that a valid, usable cpumask attribute
      is used by the workqueue infrastructure when setting up new resource
      pools.  It prevents a crash that has been observed when an 'empty'
      cpumask is passed along to the worker/task scheduling code.  It is
      intended as a temporary workaround until a more fundamental review and
      correction of the issue can be done.
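      The guard amounts to a sketch like the following (simplified; the
      choice of fallback mask here is an assumption):
      
        /* a hot-removed node can leave the computed mask empty */
        if (unlikely(cpumask_empty(attrs->cpumask)))
                cpumask_copy(attrs->cpumask, cpu_possible_mask);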
      
      [With additions to the patch provided by Tejun Heo <tj@kernel.org>]
      Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  29. Jul 26, 2017 (1 commit)
    • workqueue: implicit ordered attribute should be overridable · 0a94efb5
      Committed by Tejun Heo
      5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
      ordered") automatically enabled ordered attribute for unbound
      workqueues w/ max_active == 1.  Because ordered workqueues reject
      max_active and some attribute changes, this implicit ordered mode
      broke cases where the user creates an unbound workqueue w/ max_active
      == 1 and later explicitly changes the related attributes.
      
      This patch distinguishes explicit from implicit ordered settings and
      allows attribute changes to override the ordered attribute when it
      was set implicitly.
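      A sketch of the distinction (flag names as in this era's
      workqueue.h; the surrounding logic is simplified):
      
        __WQ_ORDERED          = 1 << 17, /* internal: wq is ordered */
        __WQ_ORDERED_EXPLICIT = 1 << 19, /* alloc_ordered_workqueue() */
        
        /* in the attrs-change path: only an implicitly ordered wq may
         * have its ordering overridden */
        if (wq->flags & __WQ_ORDERED_EXPLICIT)
                return -EINVAL;
        wq->flags &= ~__WQ_ORDERED;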
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")