1. May 18, 2018 (4 commits)
    • workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status} · 6b59808b
      Committed by Tejun Heo
      There can be a lot of workqueue workers, and they all show up with
      cryptic kworker/* names, making it difficult to understand which is
      doing what and how they came to be.
      
        # ps -ef | grep kworker
        root           4       2  0 Feb25 ?        00:00:00 [kworker/0:0H]
        root           6       2  0 Feb25 ?        00:00:00 [kworker/u112:0]
        root          19       2  0 Feb25 ?        00:00:00 [kworker/1:0H]
        root          25       2  0 Feb25 ?        00:00:00 [kworker/2:0H]
        root          31       2  0 Feb25 ?        00:00:00 [kworker/3:0H]
        ...
      
      This patch makes workqueue workers report the latest workqueue they
      were executing for through /proc/PID/{comm,stat,status}.  The extra
      information is appended to the kthread name, with an intervening '+'
      if currently executing, '-' otherwise.
      
        # cat /proc/25/comm
        kworker/2:0-events_power_efficient
        # cat /proc/25/stat
        25 (kworker/2:0-events_power_efficient) I 2 0 0 0 -1 69238880 0 0...
        # grep Name /proc/25/status
        Name:   kworker/2:0-events_power_efficient
      
      Unfortunately, ps(1) truncates comm to 15 characters,
      
        # ps 25
          PID TTY      STAT   TIME COMMAND
           25 ?        I      0:00 [kworker/2:0-eve]
      
      making it a lot less useful; however, this should be an easy fix from
      ps(1) side.
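      For reference, the naming rule above is simple enough to reproduce.
      Below is a minimal userspace C sketch of it; the function and
      parameter names are illustrative, not the kernel's.
      
        #include <stdbool.h>
        #include <stdio.h>
        
        /* '+' joins the name of the workqueue currently being executed,
         * '-' the one executed most recently. */
        static void format_worker_comm(char *buf, size_t size,
                                       const char *kthread_name,
                                       const char *wq_name, bool executing)
        {
                snprintf(buf, size, "%s%c%s", kthread_name,
                         executing ? '+' : '-', wq_name);
        }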
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Craig Small <csmall@enc.com.au>
    • workqueue: Set worker->desc to workqueue name by default · 8bf89593
      Committed by Tejun Heo
      Work functions can use set_worker_desc() to improve the visibility of
      what the worker task is doing.  Currently, the desc field is cleared
      at the beginning of each execution, and a separate field tracks
      whether desc was set during the current execution.
      
      Instead of being left empty until desc is set, worker->desc can by
      default remember the last workqueue the worker worked on, and users
      that call set_worker_desc() can override it with something more
      informative as necessary.
      
      This simplifies desc handling and helps track the last workqueue
      that the worker executed on, improving visibility.
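      As a concrete illustration, here is a hedged sketch of an override
      from a work function; the my_dev type and names are hypothetical,
      while set_worker_desc() is the existing interface.
      
        #include <linux/kernel.h>
        #include <linux/workqueue.h>
        
        struct my_dev {                 /* hypothetical driver type */
                struct work_struct work;
                char name[16];
        };
        
        static void my_dev_work_fn(struct work_struct *work)
        {
                struct my_dev *dev = container_of(work, struct my_dev, work);
        
                /* override the default desc (the workqueue's name, after
                 * this patch) with something more specific */
                set_worker_desc("my_dev-%s", dev->name);
                /* ... actual work ... */
        }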
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Make worker_attach/detach_pool() update worker->pool · a2d812a2
      Committed by Tejun Heo
      For historical reasons, the worker attach/detach functions don't
      currently manage worker->pool, and the callers update it manually
      and inconsistently.
      
      This patch moves worker->pool updates into the worker attach/detach
      functions.  This makes worker->pool consistent and clearly defines how
      worker->pool updates are synchronized.
      
      This will help later workqueue visibility improvements by allowing
      safe access to workqueue information from worker->task.
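      A simplified sketch of the resulting pattern (not the exact kernel
      code; the types are workqueue-internal, and the global attach mutex
      comes from the companion commit below):
      
        static void worker_attach_to_pool(struct worker *worker,
                                          struct worker_pool *pool)
        {
                mutex_lock(&wq_pool_attach_mutex);
                /* ... cpumask and flag setup elided ... */
                list_add_tail(&worker->node, &pool->workers);
                worker->pool = pool;    /* owned by attach/detach now */
                mutex_unlock(&wq_pool_attach_mutex);
        }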
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Replace pool->attach_mutex with global wq_pool_attach_mutex · 1258fae7
      Committed by Tejun Heo
      To improve workqueue visibility, we want to be able to access
      workqueue information from worker tasks.  The per-pool attach mutex
      makes that difficult because there's no way of stabilizing task ->
      worker pool association without knowing the pool first.
      
      Worker attach/detach is a slow path and there's no need for different
      pools to be able to perform them concurrently.  This patch replaces
      the per-pool attach_mutex with global wq_pool_attach_mutex to prepare
      for visibility improvement changes.
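      In sketch form, the lookup this enables (simplified; it assumes the
      task is already known to be a kworker):
      
        mutex_lock(&wq_pool_attach_mutex);
        worker = kthread_data(task);
        if (worker && worker->pool) {
                /* worker->pool cannot change while the mutex is held,
                 * so pool-derived info can be read safely here */
        }
        mutex_unlock(&wq_pool_attach_mutex);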
      Signed-off-by: Tejun Heo <tj@kernel.org>
  2. Mar 21, 2018 (2 commits)
  3. Mar 20, 2018 (1 commit)
    • RCU, workqueue: Implement rcu_work · 05f0fe6b
      Committed by Tejun Heo
      There are cases where an RCU callback needs to be bounced to a
      sleepable context.  This is currently done by the RCU callback
      queueing a work item, which can be cumbersome to write and confusing
      to read.
      
      This patch introduces rcu_work, a workqueue work variant which gets
      executed after an RCU grace period, and converts the open-coded
      bouncing in fs/aio and kernel/cgroup.
      
      v3: Dropped queue_rcu_work_on().  Documented RCU grace period
          behavior after queue_rcu_work().
      
      v2: Use rcu_barrier() instead of synchronize_rcu() to wait for
          completion of previously queued rcu callback as per Paul.
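      Usage, per the interface this patch adds (a hedged sketch; the
      my_obj type and names are hypothetical):
      
        #include <linux/slab.h>
        #include <linux/workqueue.h>
        
        struct my_obj {
                struct rcu_work rwork;
                /* ... payload ... */
        };
        
        static void my_obj_free_fn(struct work_struct *work)
        {
                struct my_obj *obj = container_of(to_rcu_work(work),
                                                  struct my_obj, rwork);
        
                kfree(obj);     /* sleepable, after a grace period */
        }
        
        static void my_obj_release(struct my_obj *obj)
        {
                /* instead of call_rcu() + a hand-rolled work bounce */
                INIT_RCU_WORK(&obj->rwork, my_obj_free_fn);
                queue_rcu_work(system_wq, &obj->rwork);
        }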
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
  4. Mar 14, 2018 (2 commits)
  5. Feb 21, 2018 (1 commit)
  6. Feb 17, 2018 (1 commit)
  7. Jan 15, 2018 (1 commit)
    • staging: lustre: lnet: convert selftest to use workqueues · 6106c0f8
      Committed by NeilBrown
      Instead of the cfs workitem library, use workqueues.
      
      As lnet wants to provide a CPU mask of allowed CPUs, it needs to be
      a WQ_UNBOUND workqueue so that tasks can run on CPUs other than the
      one where they were submitted.
      
      This patch also exports apply_workqueue_attrs(), a documented part
      of the workqueue API that isn't currently exported.  Lustre needs it
      to allow workqueue threads to be limited to a subset of CPUs.
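      A sketch of the conversion pattern (error handling elided; the
      workqueue name and parameters are illustrative, and
      alloc_workqueue_attrs() is shown with this era's signature):
      
        static struct workqueue_struct *
        lst_create_wq(const struct cpumask *allowed_mask, int nthreads)
        {
                struct workqueue_struct *wq;
                struct workqueue_attrs *attrs;
        
                wq = alloc_workqueue("lst_sched", WQ_UNBOUND, nthreads);
                attrs = alloc_workqueue_attrs(GFP_KERNEL);
                cpumask_copy(attrs->cpumask, allowed_mask); /* CPU subset */
                apply_workqueue_attrs(wq, attrs);   /* needs the new export */
                free_workqueue_attrs(attrs);
                return wq;
        }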
      
      Acked-by: Tejun Heo <tj@kernel.org> (for export of apply_workqueue_attrs)
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. Jan 13, 2018 (1 commit)
  9. Jan 8, 2018 (2 commits)
    • workqueue: allow WQ_MEM_RECLAIM on early init workqueues · 40c17f75
      Committed by Tejun Heo
      Workqueues can be created early during boot, before the workqueue
      subsystem is fully online; work items queued then are held until
      full initialization completes later.  However, early init wasn't
      supported for WQ_MEM_RECLAIM workqueues, causing unnecessary
      annoyances for a subset of users.  Expand early init support to
      include WQ_MEM_RECLAIM workqueues.
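      A sketch of what workqueue_init() can now do once kthreads are
      available (simplified, modeled on the description and on the
      init_rescuer() split in the next commit):
      
        list_for_each_entry(wq, &workqueues, list) {
                /* attach rescuers to early WQ_MEM_RECLAIM workqueues */
                WARN(init_rescuer(wq),
                     "failed to create rescuer for wq \"%s\"\n", wq->name);
        }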
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
    • workqueue: separate out init_rescuer() · 983c7515
      Committed by Tejun Heo
      Separate out init_rescuer() from __alloc_workqueue_key() to prepare
      for early init support for WQ_MEM_RECLAIM.  This patch doesn't
      introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
  10. Dec 11, 2017 (1 commit)
  11. Dec 5, 2017 (3 commits)
  12. Nov 28, 2017 (1 commit)
  13. Nov 22, 2017 (1 commit)
    • treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts · 841b86f3
      Committed by Kees Cook
      With all callbacks converted, and the timer callback prototype
      switched over, the TIMER_FUNC_TYPE cast is no longer needed,
      so remove it. Conversion was done with the following scripts:
      
          perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
              $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)
      
          perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
              $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)
      
      The now unused macros are also dropped from include/linux/timer.h.
      Signed-off-by: Kees Cook <keescook@chromium.org>
  14. Nov 8, 2017 (1 commit)
  15. Nov 6, 2017 (1 commit)
  16. Nov 3, 2017 (1 commit)
  17. Oct 25, 2017 (2 commits)
    • workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes · fd1a5b04
      Committed by Byungchul Park
      The workqueue code added manual lock acquisition annotations to catch
      deadlocks.
      
      After lockdep crossrelease was introduced, some of those became
      redundant, since wait_for_completion() already does the acquisition
      and tracking.
      
      Remove the duplicate annotations.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: amir73il@gmail.com
      Cc: axboe@kernel.dk
      Cc: darrick.wong@oracle.com
      Cc: david@fromorbit.com
      Cc: hch@infradead.org
      Cc: idryomov@gmail.com
      Cc: johan@kernel.org
      Cc: johannes.berg@intel.com
      Cc: kernel-team@lge.com
      Cc: linux-block@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-xfs@vger.kernel.org
      Cc: oleg@redhat.com
      Cc: tj@kernel.org
      Link: http://lkml.kernel.org/r/1508921765-15396-9-git-send-email-byungchul.park@lge.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/atomics, workqueue: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE() · c95491ed
      Committed by Mark Rutland
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't currently harmful.
      
      However, for some features it is necessary to instrument reads and
      writes separately, which is not possible with ACCESS_ONCE(). This
      distinction is critical to correct operation.
      
      It's possible to transform the bulk of kernel code using the Coccinelle
      script below. However, this doesn't handle comments, leaving references
      to ACCESS_ONCE() instances which have been removed. As a preparatory
      step, this patch converts the workqueue code and comments to use
      {READ,WRITE}_ONCE() consistently.
      
      ----
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-12-git-send-email-paulmck@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  18. Oct 22, 2017 (1 commit)
  19. Oct 18, 2017 (1 commit)
  20. Oct 10, 2017 (1 commit)
    • workqueue: replace pool->manager_arb mutex with a flag · 692b4825
      Committed by Tejun Heo
      Josef reported a HARDIRQ-safe -> HARDIRQ-unsafe lock order detected by
      lockdep:
      
       [ 1270.472259] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
       [ 1270.472783] 4.14.0-rc1-xfstests-12888-g76833e8 #110 Not tainted
       [ 1270.473240] -----------------------------------------------------
       [ 1270.473710] kworker/u5:2/5157 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
       [ 1270.474239]  (&(&lock->wait_lock)->rlock){+.+.}, at: [<ffffffff8da253d2>] __mutex_unlock_slowpath+0xa2/0x280
       [ 1270.474994]
       [ 1270.474994] and this task is already holding:
       [ 1270.475440]  (&pool->lock/1){-.-.}, at: [<ffffffff8d2992f6>] worker_thread+0x366/0x3c0
       [ 1270.476046] which would create a new lock dependency:
       [ 1270.476436]  (&pool->lock/1){-.-.} -> (&(&lock->wait_lock)->rlock){+.+.}
       [ 1270.476949]
       [ 1270.476949] but this new dependency connects a HARDIRQ-irq-safe lock:
       [ 1270.477553]  (&pool->lock/1){-.-.}
       ...
       [ 1270.488900] to a HARDIRQ-irq-unsafe lock:
       [ 1270.489327]  (&(&lock->wait_lock)->rlock){+.+.}
       ...
       [ 1270.494735]  Possible interrupt unsafe locking scenario:
       [ 1270.494735]
       [ 1270.495250]        CPU0                    CPU1
       [ 1270.495600]        ----                    ----
       [ 1270.495947]   lock(&(&lock->wait_lock)->rlock);
       [ 1270.496295]                                local_irq_disable();
       [ 1270.496753]                                lock(&pool->lock/1);
       [ 1270.497205]                                lock(&(&lock->wait_lock)->rlock);
       [ 1270.497744]   <Interrupt>
       [ 1270.497948]     lock(&pool->lock/1);
      
      which will cause an IRQ inversion deadlock if the above lock
      scenario happens.
      
      The root cause of this safe -> unsafe lock order is the
      mutex_unlock(pool->manager_arb) in manage_workers() with pool->lock
      held.
      
      Unlocking a mutex while holding an irq spinlock was never safe, and
      this problem has been around forever, but it never got noticed
      because the mutex is usually only trylocked while holding the irq
      lock, making actual failures very unlikely, and the lockdep
      annotation missed the condition until the recent b9c16a0e
      ("locking/mutex: Fix lockdep_assert_held() fail").
      
      Using a mutex for pool->manager_arb has always been a bit of a
      stretch.  It is primarily a mechanism to arbitrate managership
      between workers, which can easily be done with a pool flag.  The
      only reason it became a mutex is that the pool destruction path
      wants to exclude parallel managing operations.
      
      This patch replaces the mutex with a new pool flag,
      POOL_MANAGER_ACTIVE, and makes the destruction path wait for the
      current manager on a wait queue.
      
      v2: Drop unnecessary flag clearing before pool destruction as
          suggested by Boqun.
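      A simplified sketch of the flag-based arbitration (modeled on the
      description; wq_manager_wait names the wait queue the destruction
      path sleeps on):
      
        static bool manage_workers(struct worker *worker)
        {
                struct worker_pool *pool = worker->pool;
        
                if (pool->flags & POOL_MANAGER_ACTIVE)
                        return false;   /* someone else is managing */
        
                pool->flags |= POOL_MANAGER_ACTIVE;
                maybe_create_worker(pool);
                pool->flags &= ~POOL_MANAGER_ACTIVE;
                wake_up(&wq_manager_wait);  /* destruction may wait here */
                return true;
        }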
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: stable@vger.kernel.org
  21. Oct 5, 2017 (2 commits)
    • workqueue: Convert callback to use from_timer() · 8c20feb6
      Committed by Kees Cook
      In preparation for unconditionally passing the struct timer_list pointer
      to all timer callbacks, switch workqueue to use from_timer() and pass the
      timer pointer explicitly.
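      The shape of the conversion, using workqueue's mayday timer as the
      example (simplified sketch):
      
        /* before: an opaque unsigned long cast back to the pool */
        static void pool_mayday_timeout(unsigned long __pool)
        {
                struct worker_pool *pool = (void *)__pool;
                /* ... */
        }
        
        /* after: the timer pointer is passed in and the container
         * recovered with from_timer() */
        static void pool_mayday_timeout(struct timer_list *t)
        {
                struct worker_pool *pool = from_timer(pool, t, mayday_timer);
                /* ... */
        }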
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux1394-devel@lists.sourceforge.net
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Harish Patil <harish.patil@cavium.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Manish Chopra <manish.chopra@cavium.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-pm@vger.kernel.org
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: linux-watchdog@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Michael Reed <mdr@sgi.com>
      Cc: netdev@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lkml.kernel.org/r/1507159627-127660-14-git-send-email-keescook@chromium.org
    • timer: Remove users of TIMER_DEFERRED_INITIALIZER · 5cd79d6a
      Committed by Kees Cook
      This removes uses of TIMER_DEFERRED_INITIALIZER and chooses a location
      to call timer_setup() from before add_timer() or mod_timer() is called.
      Adjusts callbacks to use from_timer() as needed.
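      A sketch of the replacement pattern (names taken from the workqueue
      pool timers; simplified):
      
        /* was a static TIMER_DEFERRED_INITIALIZER at definition time;
         * now set up at runtime before the timer is first armed */
        timer_setup(&pool->idle_timer, idle_worker_timeout,
                    TIMER_DEFERRABLE);
        mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);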
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux1394-devel@lists.sourceforge.net
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Harish Patil <harish.patil@cavium.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Manish Chopra <manish.chopra@cavium.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-pm@vger.kernel.org
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: linux-watchdog@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Michael Reed <mdr@sgi.com>
      Cc: netdev@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lkml.kernel.org/r/1507159627-127660-7-git-send-email-keescook@chromium.org
  22. Aug 29, 2017 (1 commit)
  23. Aug 25, 2017 (2 commits)
    • locking/lockdep: Fix workqueue crossrelease annotation · e6f3faa7
      Committed by Peter Zijlstra
      The new completion/crossrelease annotations interact unfavourably
      with the extant flush_work()/flush_workqueue() annotations.
      
      The problem is that when a single work class does:
      
        wait_for_completion(&C)
      
      and
      
        complete(&C)
      
      in different executions, we'll build dependencies like:
      
        lock_map_acquire(W)
        complete_acquire(C)
      
      and
      
        lock_map_acquire(W)
        complete_release(C)
      
      which results in the dependency chain W->C->W, which lockdep thinks
      spells deadlock, even though there is no deadlock potential since
      works are run concurrently.
      
      One possibility would be to change the work 'lock' to
      recursive-read, but that would mean hitting a lockdep limitation on
      recursive locks.  Also, unconditionally switching to recursive-read
      here would fail to detect the actual deadlock on single-threaded
      workqueues, which do have a problem with this.
      
      For now, forcefully disregard these locks for crossrelease.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • workqueue/lockdep: 'Fix' flush_work() annotation · a1d14934
      Committed by Peter Zijlstra
      The flush_work() annotation as introduced by commit:
      
        e159489b ("workqueue: relax lockdep annotation on flush_work()")
      
      hits on the lockdep problem with recursive read locks.
      
      The situation as described is:
      
      Work W1:                Work W2:        Task:
      
      ARR(Q)                  ARR(Q)          flush_workqueue(Q)
      A(W1)                   A(W2)           A(Q)
        flush_work(W2)                        R(Q)
          A(W2)
          R(W2)
          if (special)
            A(Q)
          else
            ARR(Q)
          R(Q)
      
      where: A - acquire, ARR - acquire-read-recursive, R - release.
      
      Under 'special' conditions we want to trigger a lock-recursion
      deadlock, but otherwise allow the flush_work().  The allowing is
      done with recursive read locks (ARR), but lockdep is broken for
      recursive locks.
      
      However, there appears to be no need to acquire the lock if we're not
      'special', so if we remove the 'else' clause things become much
      simpler and no longer need the recursion thing at all.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  24. Aug 23, 2017 (1 commit)
  25. Aug 17, 2017 (1 commit)
    • locking/lockdep: Explicitly initialize wq_barrier::done::map · 52fa5bc5
      Committed by Boqun Feng
      With the new lockdep crossrelease feature, which checks completions usage,
      a false positive is reported in the workqueue code:
      
      > Worker A : acquired of wfc.work -> wait for cpu_hotplug_lock to be released
      > Task   B : acquired of cpu_hotplug_lock -> wait for lock#3 to be released
      > Task   C : acquired of lock#3 -> wait for completion of barr->done
      > (Task C is in lru_add_drain_all_cpuslocked())
      > Worker D : wait for wfc.work to be released -> will complete barr->done
      
      Such a deadlock cannot happen because Task C's barr->done and Worker
      D's barr->done cannot be the same instance.
      
      The reason for this false positive is that we initialize all
      wq_barrier::done instances at insert_wq_barrier() via
      init_completion(), which makes them all belong to the same lock
      class; as a result, impossible circles are reported.
      
      To fix this, explicitly initialize the lockdep map for
      wq_barrier::done in insert_wq_barrier(), so that the lock class key
      of wq_barrier::done is a subkey of the corresponding work_struct.
      As a result we won't build a dependency between a wq_barrier and an
      unrelated work, and we can differentiate wq barriers based on their
      related works, avoiding the false positive above.
      
      Also define an empty lockdep_init_map_crosslock() for !CROSSRELEASE
      to keep the code simple and free of unnecessary #ifdefs.
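      A sketch of the resulting initialization in insert_wq_barrier()
      (modeled on the description above, not a verbatim diff):
      
        lockdep_init_map_crosslock((struct lockdep_map *)&barr->done.map,
                                   "(complete)wq_barr::done",
                                   target->lockdep_map.key, 1);
        __init_completion(&barr->done);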
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170817094622.12915-1-boqun.feng@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  26. Aug 10, 2017 (1 commit)
    • locking/lockdep: Implement the 'crossrelease' feature · b09be676
      Committed by Byungchul Park
      Lockdep is a runtime locking correctness validator that detects and
      reports a deadlock or its possibility by checking dependencies between
      locks. It's useful since it does not report just an actual deadlock but
      also the possibility of a deadlock that has not actually happened yet.
      That enables problems to be fixed before they affect real systems.
      
      However, this facility is only applicable to typical locks, such as
      spinlocks and mutexes, which are normally released within the
      context in which they were acquired.  Synchronization primitives
      like page locks or completions, which are allowed to be released in
      any context, also create dependencies and can cause a deadlock.
      
      So lockdep should track these locks to do a better job. The 'crossrelease'
      implementation makes these primitives also be tracked.
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: boqun.feng@gmail.com
      Cc: kernel-team@lge.com
      Cc: kirill@shutemov.name
      Cc: npiggin@gmail.com
      Cc: walken@google.com
      Cc: willy@infradead.org
      Link: http://lkml.kernel.org/r/1502089981-21272-6-git-send-email-byungchul.park@lge.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. Aug 7, 2017 (1 commit)
  28. Jul 28, 2017 (1 commit)
    • workqueue: Work around edge cases for calc of pool's cpumask · 1ad0f0a7
      Committed by Michael Bringmann
      There is an underlying assumption/trade-off in many layers of the Linux
      system that CPU <-> node mapping is static.  This is despite the presence
      of features like NUMA and 'hotplug' that support the dynamic addition/
      removal of fundamental system resources like CPUs and memory.  PowerPC
      systems, however, do provide extensive features for the dynamic change
      of resources available to a system.
      
      Currently, there is little or no synchronization protection around the
      updating of the CPU <-> node mapping, and the export/update of this
      information for other layers / modules.  In systems which can change
      this mapping during 'hotplug', like PowerPC, the information is changing
      underneath all layers that might reference it.
      
      This patch attempts to ensure that a valid, usable cpumask attribute
      is used by the workqueue infrastructure when setting up new resource
      pools.  It prevents a crash that has been observed when an 'empty'
      cpumask is passed along to the worker/task scheduling code.  It is
      intended as a temporary workaround until a more fundamental review and
      correction of the issue can be done.
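      The guard amounts to a sketch like the following (simplified; the
      choice of fallback mask here is an assumption):
      
        /* a hot-removed node can leave the computed mask empty */
        if (unlikely(cpumask_empty(attrs->cpumask)))
                cpumask_copy(attrs->cpumask, cpu_possible_mask);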
      
      [With additions to the patch provided by Tejun Heo <tj@kernel.org>]
      Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  29. Jul 26, 2017 (1 commit)
    • workqueue: implicit ordered attribute should be overridable · 0a94efb5
      Committed by Tejun Heo
      5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
      ordered") automatically enabled ordered attribute for unbound
      workqueues w/ max_active == 1.  Because ordered workqueues reject
      max_active and some attribute changes, this implicit ordered mode
      broke cases where the user creates an unbound workqueue w/ max_active
      == 1 and later explicitly changes the related attributes.
      
      This patch distinguishes explicit from implicit ordered settings and
      allows attribute changes to override the ordered attribute when it
      was set implicitly.
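      A sketch of the distinction (flag names as in this era's
      workqueue.h; the surrounding logic is simplified):
      
        __WQ_ORDERED          = 1 << 17, /* internal: wq is ordered */
        __WQ_ORDERED_EXPLICIT = 1 << 19, /* alloc_ordered_workqueue() */
        
        /* in the attrs-change path: only an implicitly ordered wq may
         * have its ordering overridden */
        if (wq->flags & __WQ_ORDERED_EXPLICIT)
                return -EINVAL;
        wq->flags &= ~__WQ_ORDERED;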
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")