- 21 Feb 2018, 1 commit
-
-
Submitted by Frederic Weisbecker
As we prepare for offloading the residual 1Hz scheduler ticks to a workqueue, let's affine those to the housekeeping CPUs so that they don't interrupt the CPUs that don't want to be disturbed. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1519186649-3242-5-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
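A hedged sketch of the direction, assuming the housekeeping API from the same patch series (housekeeping_any_cpu()/HK_FLAG_TIMER); the work item and callback are hypothetical, not the actual patch:

```c
#include <linux/workqueue.h>
#include <linux/sched/isolation.h>

static void tick_fn(struct work_struct *work)
{
	/* hypothetical stand-in for the residual remote-tick processing */
}

static DECLARE_DELAYED_WORK(tick_work, tick_fn);

static void start_residual_tick(void)
{
	/* run the 1Hz leftover on a housekeeping CPU only, so that
	 * isolated/nohz_full CPUs stay undisturbed */
	int cpu = housekeeping_any_cpu(HK_FLAG_TIMER);

	queue_delayed_work_on(cpu, system_wq, &tick_work, HZ);
}
```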
-
- 15 Jan 2018, 1 commit
-
-
Submitted by NeilBrown
Instead of the cfs workitem library, use workqueues. As lnet wants to provide a cpu mask of allowed cpus, the queue needs to be WQ_UNBOUND so that tasks can run on cpus other than the one they were submitted on. This patch also exports apply_workqueue_attrs(), a documented part of the workqueue API that isn't currently exported; lustre needs it to allow workqueue threads to be limited to a subset of CPUs. Acked-by: Tejun Heo <tj@kernel.org> (for export of apply_workqueue_attrs) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
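A sketch of the pattern this export enables; the workqueue name and mask handling are illustrative, not lustre's actual code:

```c
#include <linux/workqueue.h>
#include <linux/cpumask.h>

static struct workqueue_struct *lnet_wq;

static int setup_restricted_wq(const struct cpumask *allowed)
{
	struct workqueue_attrs *attrs;
	int ret;

	/* must be WQ_UNBOUND so work may run on any CPU in @allowed */
	lnet_wq = alloc_workqueue("lnet_wq", WQ_UNBOUND, 0);
	if (!lnet_wq)
		return -ENOMEM;

	attrs = alloc_workqueue_attrs(GFP_KERNEL);
	if (!attrs) {
		destroy_workqueue(lnet_wq);
		return -ENOMEM;
	}

	cpumask_copy(attrs->cpumask, allowed);
	ret = apply_workqueue_attrs(lnet_wq, attrs);	/* newly exported */
	free_workqueue_attrs(attrs);
	return ret;
}
```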
-
- 13 Jan 2018, 1 commit
-
-
Submitted by Sergey Senozhatsky
show_workqueue_state() can print out a lot of messages while in atomic context, e.g. sysrq-t -> show_workqueue_state(). If the console device is slow it may end up triggering the NMI hard-lockup watchdog. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v4.5+
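The usual remedy is to feed the watchdog between chunks of output; a minimal sketch of that pattern (the printing loop is illustrative, not the actual function body):

```c
#include <linux/nmi.h>
#include <linux/printk.h>

static void dump_lots_of_state(void)
{
	int i;

	for (i = 0; i < 1024; i++) {
		/*
		 * Printing to a slow console from atomic context can take
		 * long enough to trip the NMI hard-lockup detector; pet
		 * the watchdog between chunks.
		 */
		touch_nmi_watchdog();
		pr_info("state chunk %d\n", i);
	}
}
```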
-
- 08 Jan 2018, 2 commits
-
-
Submitted by Tejun Heo
Workqueues can be created early during boot before the workqueue subsystem is fully online - work items are queued waiting for later full initialization. However, early init wasn't supported for WQ_MEM_RECLAIM workqueues, causing unnecessary annoyances for a subset of users. Expand early init support to include WQ_MEM_RECLAIM workqueues. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
-
Submitted by Tejun Heo
Separate out init_rescuer() from __alloc_workqueue_key() to prepare for early init support for WQ_MEM_RECLAIM. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
-
- 11 Dec 2017, 1 commit
-
-
Submitted by Sergey Senozhatsky
The file was converted from print_symbol() to %pf some time ago (044c782c "workqueue: fix checkpatch issues"). The kallsyms include does not seem to be needed anymore. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 05 Dec 2017, 3 commits
-
-
Submitted by Lai Jiangshan
Since the cpu/hotplug refactoring, DOWN_FAILED is never called without a preceding DOWN_PREPARE, making the workaround unnecessary. Remove it. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Submitted by Lai Jiangshan
Since the recent cpu/hotplug refactoring, workqueue_offline_cpu() is guaranteed to run on the local cpu which is going offline. This also fixes the following deadlock by removing work item scheduling and flushing from the CPU hotplug path. http://lkml.kernel.org/r/1504764252-29091-1-git-send-email-prsood@codeaurora.org tj: Description update. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Submitted by Paul E. McKenney
Now that cond_resched() also provides RCU quiescent states when needed, it can be used in place of cond_resched_rcu_qs(). This commit therefore makes this change. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
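In practice this is a one-for-one substitution in long-running loops; a small sketch with hypothetical helpers (more_work()/do_chunk() are stand-ins):

```c
#include <linux/sched.h>

bool more_work(void);	/* hypothetical */
void do_chunk(void);	/* hypothetical */

static void long_running_work_fn(void)
{
	while (more_work()) {
		do_chunk();
		/*
		 * cond_resched() now also supplies an RCU quiescent
		 * state, so cond_resched_rcu_qs() is no longer needed.
		 */
		cond_resched();
	}
}
```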
-
- 28 Nov 2017, 1 commit
-
-
Submitted by Tal Shorer
Initialize wq_unbound_cpumask to exclude cpus that were isolated by the cmdline's isolcpus parameter. Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
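The change plausibly amounts to one mask operation at early init; a hedged sketch, with the masks passed as parameters for self-containment (the then-current cpu_isolated_map symbol is an assumption):

```c
#include <linux/cpumask.h>

/* in workqueue_init_early(), roughly: drop isolcpus= CPUs from the
 * default mask used by unbound workqueues */
static void init_unbound_mask(struct cpumask *wq_unbound_cpumask,
			      const struct cpumask *cpu_isolated_map)
{
	cpumask_andnot(wq_unbound_cpumask, cpu_possible_mask,
		       cpu_isolated_map);
}
```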
-
- 22 Nov 2017, 1 commit
-
-
Submitted by Kees Cook
With all callbacks converted, and the timer callback prototype switched over, the TIMER_FUNC_TYPE cast is no longer needed, so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h. Signed-off-by: Kees Cook <keescook@chromium.org>
-
- 08 Nov 2017, 1 commit
-
-
Submitted by Frederic Weisbecker
Use lockdep to check that IRQs are enabled or disabled as expected. This way the sanity check only adds overhead when concurrency-correctness debug code is enabled. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tejun Heo <tj@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1509980490-4285-4-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
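A before/after sketch of the substitution this series makes possible; the surrounding function is hypothetical:

```c
#include <linux/lockdep.h>
#include <linux/irqflags.h>

static void assert_queueing_context(void)
{
	/* before: checked unconditionally, paying the cost even in
	 * production builds */
	WARN_ON_ONCE(!irqs_disabled());

	/* after: compiles to nothing unless lockdep is enabled */
	lockdep_assert_irqs_disabled();
}
```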
-
- 06 Nov 2017, 1 commit
-
-
Submitted by Wang Long
Signed-off-by: Wang Long <wanglong19@meituan.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 03 Nov 2017, 1 commit
-
- 25 Oct 2017, 2 commits
-
-
Submitted by Byungchul Park
The workqueue code added manual lock acquisition annotations to catch deadlocks. After lockdep crossrelease was introduced, some of those became redundant, since wait_for_completion() already does the acquisition and tracking. Remove the duplicate annotations. Signed-off-by: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: amir73il@gmail.com Cc: axboe@kernel.dk Cc: darrick.wong@oracle.com Cc: david@fromorbit.com Cc: hch@infradead.org Cc: idryomov@gmail.com Cc: johan@kernel.org Cc: johannes.berg@intel.com Cc: kernel-team@lge.com Cc: linux-block@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-xfs@vger.kernel.org Cc: oleg@redhat.com Cc: tj@kernel.org Link: http://lkml.kernel.org/r/1508921765-15396-9-git-send-email-byungchul.park@lge.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Mark Rutland
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't currently harmful. However, for some features it is necessary to instrument reads and writes separately, which is not possible with ACCESS_ONCE(). This distinction is critical to correct operation. It's possible to transform the bulk of kernel code using the Coccinelle script below. However, this doesn't handle comments, leaving references to ACCESS_ONCE() instances which have been removed. As a preparatory step, this patch converts the workqueue code and comments to use {READ,WRITE}_ONCE() consistently.

    ----
    virtual patch

    @ depends on patch @
    expression E1, E2;
    @@

    - ACCESS_ONCE(E1) = E2
    + WRITE_ONCE(E1, E2)

    @ depends on patch @
    expression E;
    @@

    - ACCESS_ONCE(E)
    + READ_ONCE(E)
    ----

Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-12-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 22 Oct 2017, 1 commit
-
-
Submitted by Tal Shorer
Initialize wq_unbound_cpumask to exclude cpus that were isolated by the cmdline's isolcpus parameter. Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 18 Oct 2017, 1 commit
-
-
Submitted by Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. (The prior workqueue patch missed a few timers.) Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Link: https://lkml.kernel.org/r/20171016225825.GA99101@beast Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
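The conversion pattern these patches apply, sketched with a hypothetical containing struct:

```c
#include <linux/timer.h>
#include <linux/jiffies.h>

struct my_ctx {
	struct timer_list timer;
	int pending;
};

/* callbacks now take the timer pointer itself ... */
static void my_timeout(struct timer_list *t)
{
	/* ... and recover the containing object with from_timer() */
	struct my_ctx *ctx = from_timer(ctx, t, timer);

	ctx->pending = 0;
}

static void my_init(struct my_ctx *ctx)
{
	ctx->pending = 1;
	timer_setup(&ctx->timer, my_timeout, 0);
	mod_timer(&ctx->timer, jiffies + HZ);
}
```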
-
- 10 Oct 2017, 1 commit
-
-
Submitted by Tejun Heo
Josef reported a HARDIRQ-safe -> HARDIRQ-unsafe lock order detected by lockdep:

    [ 1270.472259] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
    [ 1270.472783] 4.14.0-rc1-xfstests-12888-g76833e8 #110 Not tainted
    [ 1270.473240] -----------------------------------------------------
    [ 1270.473710] kworker/u5:2/5157 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
    [ 1270.474239] (&(&lock->wait_lock)->rlock){+.+.}, at: [<ffffffff8da253d2>] __mutex_unlock_slowpath+0xa2/0x280
    [ 1270.474994]
    [ 1270.474994] and this task is already holding:
    [ 1270.475440] (&pool->lock/1){-.-.}, at: [<ffffffff8d2992f6>] worker_thread+0x366/0x3c0
    [ 1270.476046] which would create a new lock dependency:
    [ 1270.476436] (&pool->lock/1){-.-.} -> (&(&lock->wait_lock)->rlock){+.+.}
    [ 1270.476949]
    [ 1270.476949] but this new dependency connects a HARDIRQ-irq-safe lock:
    [ 1270.477553] (&pool->lock/1){-.-.}
    ...
    [ 1270.488900] to a HARDIRQ-irq-unsafe lock:
    [ 1270.489327] (&(&lock->wait_lock)->rlock){+.+.}
    ...
    [ 1270.494735] Possible interrupt unsafe locking scenario:
    [ 1270.494735]
    [ 1270.495250]        CPU0                    CPU1
    [ 1270.495600]        ----                    ----
    [ 1270.495947]   lock(&(&lock->wait_lock)->rlock);
    [ 1270.496295]                                local_irq_disable();
    [ 1270.496753]                                lock(&pool->lock/1);
    [ 1270.497205]                                lock(&(&lock->wait_lock)->rlock);
    [ 1270.497744]   <Interrupt>
    [ 1270.497948]     lock(&pool->lock/1);

This will cause an irq inversion deadlock if the above lock scenario happens. The root cause of this safe -> unsafe lock order is the mutex_unlock(pool->manager_arb) in manage_workers() with pool->lock held. Unlocking a mutex while holding an irq spinlock was never safe and this problem has been around forever, but it never got noticed because the mutex is usually only trylocked while holding the irq spinlock, making actual failures very unlikely, and the lockdep annotation missed the condition until the recent b9c16a0e ("locking/mutex: Fix lockdep_assert_held() fail"). Using a mutex for pool->manager_arb has always been a bit of a stretch. It is primarily a mechanism to arbitrate managership between workers, which can easily be done with a pool flag. The only reason it became a mutex is that the pool destruction path wants to exclude parallel managing operations. This patch replaces the mutex with a new pool flag POOL_MANAGER_ACTIVE and makes the destruction path wait for the current manager on a wait queue. v2: Drop unnecessary flag clearing before pool destruction as suggested by Boqun. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: stable@vger.kernel.org
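A hedged sketch of the flag-based arbitration described above; the names POOL_MANAGER_ACTIVE and wq_manager_wait follow the commit text, but the struct layout and exact shape of the fix are assumptions:

```c
#include <linux/wait.h>
#include <linux/spinlock.h>

/* assumed fragments of workqueue internals, for illustration only */
#define POOL_MANAGER_ACTIVE	(1 << 0)

struct worker_pool {
	spinlock_t	lock;
	unsigned int	flags;
};

static DECLARE_WAIT_QUEUE_HEAD(wq_manager_wait);

/* claim managership; called with pool->lock held */
static bool try_become_manager(struct worker_pool *pool)
{
	if (pool->flags & POOL_MANAGER_ACTIVE)
		return false;
	pool->flags |= POOL_MANAGER_ACTIVE;
	return true;
}

/* drop managership and let a waiting destroyer proceed;
 * called with pool->lock held */
static void release_manager(struct worker_pool *pool)
{
	pool->flags &= ~POOL_MANAGER_ACTIVE;
	wake_up(&wq_manager_wait);
}

/* destruction path: sleep (no mutex needed) until no manager runs */
static void wait_for_no_manager(struct worker_pool *pool)
{
	spin_lock_irq(&pool->lock);
	wait_event_lock_irq(wq_manager_wait,
			    !(pool->flags & POOL_MANAGER_ACTIVE),
			    pool->lock);
	spin_unlock_irq(&pool->lock);
}
```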
-
- 05 Oct 2017, 2 commits
-
-
Submitted by Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch workqueue to use from_timer() and pass the timer pointer explicitly. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Cc: Petr Mladek <pmladek@suse.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Kalle Valo <kvalo@qca.qualcomm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: linux1394-devel@lists.sourceforge.net Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-s390@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ursula Braun <ubraun@linux.vnet.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Harish Patil <harish.patil@cavium.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Manish Chopra <manish.chopra@cavium.com> Cc: Len Brown <len.brown@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-pm@vger.kernel.org Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Mark Gross <mark.gross@intel.com> Cc: linux-watchdog@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Michael Reed <mdr@sgi.com> Cc: netdev@vger.kernel.org Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lkml.kernel.org/r/1507159627-127660-14-git-send-email-keescook@chromium.org
-
Submitted by Kees Cook
This removes uses of TIMER_DEFERRED_INITIALIZER and chooses a location to call timer_setup() from before add_timer() or mod_timer() is called. It adjusts callbacks to use from_timer() as needed. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Cc: Petr Mladek <pmladek@suse.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Kalle Valo <kvalo@qca.qualcomm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: linux1394-devel@lists.sourceforge.net Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-s390@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ursula Braun <ubraun@linux.vnet.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Harish Patil <harish.patil@cavium.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Manish Chopra <manish.chopra@cavium.com> Cc: Len Brown <len.brown@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-pm@vger.kernel.org Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Mark Gross <mark.gross@intel.com> Cc: linux-watchdog@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Michael Reed <mdr@sgi.com> Cc: netdev@vger.kernel.org Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lkml.kernel.org/r/1507159627-127660-7-git-send-email-keescook@chromium.org
-
- 29 Aug 2017, 1 commit
-
-
Submitted by Peter Zijlstra
Where XHLOCK_{SOFT,HARD} are save/restore points in the xhlocks[] array to ensure that temporal IRQ events don't interact with task state, XHLOCK_PROC is a fundamentally different beast that just happens to share the interface. The purpose of XHLOCK_PROC is to annotate independent execution inside one task. For example workqueues: each work should appear to run in its own 'pristine' 'task'. Remove XHLOCK_PROC in favour of its own interface to avoid the confusion. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: david@fromorbit.com Cc: johannes@sipsolutions.net Cc: kernel-team@lge.com Cc: oleg@redhat.com Cc: tj@kernel.org Link: http://lkml.kernel.org/r/20170829085939.ggmb6xiohw67micb@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 25 Aug 2017, 2 commits
-
-
Submitted by Peter Zijlstra
The new completion/crossrelease annotations interact unfavourably with the extant flush_work()/flush_workqueue() annotations. The problem is that when a single work class does wait_for_completion(&C) and complete(&C) in different executions, we'll build dependencies like:

    lock_map_acquire(W)
    complete_acquire(C)

and

    lock_map_acquire(W)
    complete_release(C)

which results in the dependency chain W->C->W, which lockdep thinks spells deadlock, even though there is no deadlock potential since works are run concurrently. One possibility would be to change the work 'lock' to recursive-read, but that would mean hitting a lockdep limitation on recursive locks. Also, unconditionally switching to recursive-read here would fail to detect the actual deadlock on single-threaded workqueues, which do have a problem with this. For now, forcefully disregard these locks for crossrelease. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: byungchul.park@lge.com Cc: david@fromorbit.com Cc: johannes@sipsolutions.net Cc: oleg@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Peter Zijlstra
The flush_work() annotation as introduced by commit e159489b ("workqueue: relax lockdep annotation on flush_work()") hits the lockdep problem with recursive read locks. The situation as described is:

    Work W1:               Work W2:        Task:

    ARR(Q)                 ARR(Q)          flush_workqueue(Q)
    A(W1)                  A(W2)             A(Q)
      flush_work(W2)                         R(Q)
        A(W2)
        R(W2)
        if (special)
          A(Q)
        else
          ARR(Q)
        R(Q)

where: A - acquire, ARR - acquire-read-recursive, R - release. Under 'special' conditions we want to trigger a lock recursion deadlock, but otherwise allow the flush_work(). The allowing is done by using recursive read locks (ARR), but lockdep is broken for recursive stuff. However, there appears to be no need to acquire the lock if we're not 'special', so if we remove the 'else' clause things become much simpler and we no longer need the recursion thing at all. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: byungchul.park@lge.com Cc: david@fromorbit.com Cc: johannes@sipsolutions.net Cc: oleg@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 23 Aug 2017, 1 commit
-
-
Submitted by Peter Zijlstra
Workqueues don't use signals; they (ab)use TASK_INTERRUPTIBLE merely to avoid increasing the loadavg numbers. We've 'recently' introduced TASK_IDLE for exactly this case: 80ed87c8 ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE"); use it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org>
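The sleeping idiom this changes, as a small sketch (the surrounding function is hypothetical):

```c
#include <linux/sched.h>

static void worker_wait_for_work(void)
{
	/*
	 * Formerly TASK_INTERRUPTIBLE, (ab)used only to keep these
	 * sleeping kthreads out of the loadavg. TASK_IDLE says that
	 * directly: uninterruptible, but not counted as load.
	 */
	set_current_state(TASK_IDLE);
	schedule();
	__set_current_state(TASK_RUNNING);
}
```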
-
- 17 Aug 2017, 1 commit
-
-
Submitted by Boqun Feng
With the new lockdep crossrelease feature, which checks completions usage, a false positive is reported in the workqueue code:

    > Worker A : acquired of wfc.work -> wait for cpu_hotplug_lock to be released
    > Task   B : acquired of cpu_hotplug_lock -> wait for lock#3 to be released
    > Task   C : acquired of lock#3 -> wait for completion of barr->done
    > (Task C is in lru_add_drain_all_cpuslocked())
    > Worker D : wait for wfc.work to be released -> will complete barr->done

Such a deadlock cannot happen because Task C's barr->done and Worker D's barr->done cannot be the same instance. The reason for this false positive is that we initialize all wq_barrier::done instances at insert_wq_barrier() via init_completion(), which makes them belong to the same lock class; therefore, impossible circles are reported. To fix this, explicitly initialize the lockdep map for wq_barrier::done in insert_wq_barrier(), so that the lock class key of wq_barrier::done is a subkey of the corresponding work_struct. As a result we won't build a dependency between a wq_barrier and an unrelated work, and we can distinguish wq barriers based on their related works, so the false positive above is avoided. Also define an empty lockdep_init_map_crosslock() for !CROSSRELEASE to keep the code simple and free of unnecessary #ifdefs. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170817094622.12915-1-boqun.feng@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 10 Aug 2017, 1 commit
-
-
Submitted by Byungchul Park
Lockdep is a runtime locking correctness validator that detects and reports a deadlock or its possibility by checking dependencies between locks. It's useful since it does not report just an actual deadlock but also the possibility of a deadlock that has not actually happened yet. That enables problems to be fixed before they affect real systems. This facility, however, is only applicable to typical locks, such as spinlocks and mutexes, which are normally released within the context in which they were acquired. Synchronization primitives like page locks or completions, which are allowed to be released in any context, also create dependencies and can cause a deadlock, so lockdep should track these locks to do a better job. The 'crossrelease' implementation makes these primitives also be tracked. Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: boqun.feng@gmail.com Cc: kernel-team@lge.com Cc: kirill@shutemov.name Cc: npiggin@gmail.com Cc: walken@google.com Cc: willy@infradead.org Link: http://lkml.kernel.org/r/1502089981-21272-6-git-send-email-byungchul.park@lge.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 07 Aug 2017, 1 commit
-
-
Submitted by Benjamin Peterson
Signed-off-by: Benjamin Peterson <bp@benjamin.pe> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 28 Jul 2017, 1 commit
-
-
Submitted by Michael Bringmann
There is an underlying assumption/trade-off in many layers of the Linux system that the CPU <-> node mapping is static. This is despite the presence of features like NUMA and 'hotplug' that support the dynamic addition/removal of fundamental system resources like CPUs and memory. PowerPC systems, however, do provide extensive features for the dynamic change of resources available to a system. Currently, there is little or no synchronization protection around the updating of the CPU <-> node mapping, and the export/update of this information for other layers/modules. In systems which can change this mapping during 'hotplug', like PowerPC, the information is changing underneath all layers that might reference it. This patch attempts to ensure that a valid, usable cpumask attribute is used by the workqueue infrastructure when setting up new resource pools. It prevents a crash that has been observed when an 'empty' cpumask is passed along to the worker/task scheduling code. It is intended as a temporary workaround until a more fundamental review and correction of the issue can be done. [With additions to the patch provided by Tejun Heo <tj@kernel.org>] Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
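An illustration of the kind of defensive check described; where the guard lives and what it falls back to are assumptions, not the actual patch:

```c
#include <linux/cpumask.h>

static void sanitize_pool_mask(struct cpumask *mask,
			       const struct cpumask *fallback)
{
	/* never hand an empty cpumask to the scheduler paths */
	if (cpumask_empty(mask))
		cpumask_copy(mask, fallback);
}
```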
-
- 26 Jul 2017, 1 commit
-
-
Submitted by Tejun Heo
5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") automatically enabled the ordered attribute for unbound workqueues w/ max_active == 1. Because ordered workqueues reject max_active and some attribute changes, this implicit ordered mode broke cases where the user creates an unbound workqueue w/ max_active == 1 and later explicitly changes the related attributes. This patch distinguishes explicit and implicit ordered settings and allows overrides from attribute changes if the setting is implicit. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 5c0338c6 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
-
- 19 Jul 2017, 1 commit
-
-
Submitted by Tejun Heo
The combination of WQ_UNBOUND and max_active == 1 used to imply ordered execution. After NUMA affinity was added in 4c16bd32 ("workqueue: implement NUMA affinity for unbound workqueues"), this is no longer true due to per-node worker pools. While the right way to create an ordered workqueue is alloc_ordered_workqueue(), the documentation has been misleading for a long time and people do use WQ_UNBOUND with max_active == 1 for ordered workqueues, which can lead to subtle bugs that are very difficult to trigger. It's unlikely that we'd see a noticeable performance impact from enforcing ordering on WQ_UNBOUND / max_active == 1 workqueues. Let's automatically set __WQ_ORDERED for those workqueues. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Christoph Hellwig <hch@infradead.org> Reported-by: Alexei Potashnik <alexei@purestorage.com> Fixes: 4c16bd32 ("workqueue: implement NUMA affinity for unbound workqueues") Cc: stable@vger.kernel.org # v3.10+
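The two idioms in question, sketched (the workqueue name is illustrative):

```c
#include <linux/workqueue.h>

static struct workqueue_struct *ordered_wq;

static int __init make_ordered(void)
{
	/* the documented way to request ordered execution */
	ordered_wq = alloc_ordered_workqueue("my_ordered", 0);

	/*
	 * The historical idiom alloc_workqueue("name", WQ_UNBOUND, 1)
	 * is what this patch now promotes to __WQ_ORDERED automatically.
	 */
	return ordered_wq ? 0 : -ENOMEM;
}
```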
-
- 20 Jun 2017, 1 commit
-
-
Submitted by Ingo Molnar
Rename: wait_queue_t => wait_queue_entry_t. 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 15 Apr 2017, 1 commit
-
-
Submitted by Thomas Gleixner
work_on_cpu() is not protected against CPU hotplug. Code which must either execute on an online CPU or fail if the CPU is not available would have to protect against CPU hotplug at the call site. Provide a function which wraps the call to work_on_cpu() in get/put_online_cpus() and fails with -ENODEV if the target CPU is not online. Preparatory patch to convert several racy task affinity manipulations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tejun Heo <tj@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170412201042.262610721@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
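The wrapper is small enough to sketch in full; this follows the commit's description (hotplug-locking primitives of that era), so treat it as a plausible shape rather than the verbatim patch:

```c
#include <linux/cpu.h>
#include <linux/workqueue.h>

long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
{
	long ret = -ENODEV;

	get_online_cpus();		/* block CPU hotplug */
	if (cpu_online(cpu))
		ret = work_on_cpu(cpu, fn, arg);
	put_online_cpus();
	return ret;
}
```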
-
- 07 Mar 2017, 2 commits
-
-
Submitted by Geliang Tang
Use setup_deferrable_timer() instead of init_timer_deferrable() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Submitted by Tejun Heo
If queue_delayed_work() gets called with a NULL @wq, the kernel will oops asynchronously on timer expiration, which isn't too helpful in tracking down the offender. This actually happened with smc. __queue_delayed_work() already does several input sanity checks synchronously. Add a NULL @wq check. Reported-by: Dave Jones <davej@codemonkey.org.uk> Link: http://lkml.kernel.org/r/20170227171439.jshx3qplflyrgcv7@codemonkey.org.uk Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 10 Feb 2017, 1 commit
-
-
Submitted by Kees Cook
Currently CONFIG_TIMER_STATS exposes process information across namespaces:

    kernel/time/timer_list.c print_timer():
        SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);

    /proc/timer_list:
        #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely removes CONFIG_TIMER_STATS. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: linux-doc@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Xing Gao <xgao01@email.wm.edu> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jessica Frazelle <me@jessfraz.com> Cc: kernel-hardening@lists.openwall.com Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Michal Marek <mmarek@suse.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Olof Johansson <olof@lixom.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-api@vger.kernel.org Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 20 Oct 2016, 1 commit
-
-
Submitted by Tejun Heo
While splitting up workqueue initialization into two parts, ac8f73400782 ("workqueue: make workqueue available early during boot") put wq_numa_init() into workqueue_init_early(). Unfortunately, on some archs including power and arm64, the cpu to node mapping isn't yet established by the time the early init is called, leading to incorrect NUMA initialization and subsequently the following oops due to a zero cpumask on node-specific unbound pools:

    Unable to handle kernel paging request for data at address 0x00000038
    Faulting instruction address: 0xc0000000000fc0cc
    Oops: Kernel access of bad area, sig: 11 [#1]
    SMP NR_CPUS=2048 NUMA PowerNV
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-compiler_gcc-6.2.0-next-20161005 #94
    task: c0000007f5400000 task.stack: c000001ffc084000
    NIP: c0000000000fc0cc LR: c0000000000ed928 CTR: c0000000000fbfd0
    REGS: c000001ffc087780 TRAP: 0300 Not tainted (4.8.0-compiler_gcc-6.2.0-next-20161005)
    MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 48000424 XER: 00000000
    CFAR: c0000000000089dc DAR: 0000000000000038 DSISR: 40000000 SOFTE: 0
    GPR00: c0000000000ed928 c000001ffc087a00 c000000000e63200 c000000010d6d600
    GPR04: c0000007f5409200 0000000000000021 000000000748e08c 000000000000001f
    GPR08: 0000000000000000 0000000000000021 000000000748f1f8 0000000000000000
    GPR12: 0000000028000422 c00000000fb80000 c00000000000e0c8 0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000021 0000000000000001
    GPR20: ffffffffafb50401 0000000000000000 c000000010d6d600 000000000000ba7e
    GPR24: 000000000000ba7e c000000000d8bc58 afb504000afb5041 0000000000000001
    GPR28: 0000000000000000 0000000000000004 c0000007f5409280 0000000000000000
    NIP [c0000000000fc0cc] enqueue_task_fair+0xfc/0x18b0
    LR [c0000000000ed928] activate_task+0x78/0xe0
    Call Trace:
    [c000001ffc087a00] [c0000007f5409200] 0xc0000007f5409200 (unreliable)
    [c000001ffc087b10] [c0000000000ed928] activate_task+0x78/0xe0
    [c000001ffc087b50] [c0000000000ede58] ttwu_do_activate+0x68/0xc0
    [c000001ffc087b90] [c0000000000ef1b8] try_to_wake_up+0x208/0x4f0
    [c000001ffc087c10] [c0000000000d3484] create_worker+0x144/0x250
    [c000001ffc087cb0] [c000000000cd72d0] workqueue_init+0x124/0x150
    [c000001ffc087d00] [c000000000cc0e74] kernel_init_freeable+0x158/0x360
    [c000001ffc087dc0] [c00000000000e0e4] kernel_init+0x24/0x160
    [c000001ffc087e30] [c00000000000bfa0] ret_from_kernel_thread+0x5c/0xbc
    Instruction dump:
    62940401 3b800000 3aa00000 7f17c378 3a600001 3b600001 60000000 60000000
    60420000 72490021 ebfe0150 2f890001 <ebbf0038> 419e0de0 7fbee840 419e0e58
    ---[ end trace 0000000000000000 ]---

Fix it by moving wq_numa_init() to workqueue_init(). As this means that the early initialization may not have full NUMA info for per-cpu pools and ignores NUMA affinity for unbound pools, fix them up from workqueue_init() after wq_numa_init(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/87twck5wqo.fsf@concordia.ellerman.id.au Fixes: ac8f73400782 ("workqueue: make workqueue available early during boot") Signed-off-by: Tejun Heo <tj@kernel.org>
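A hedged sketch of the resulting ordering; the fix-up pass is summarized in comments, and the details are assumptions based on the commit text:

```c
void __init workqueue_init(void)
{
	/*
	 * By now arch code has established the cpu -> node mapping on
	 * all architectures, so NUMA init is safe here (it was not in
	 * workqueue_init_early() on power/arm64).
	 */
	wq_numa_init();

	/* fix-up pass over pools created before NUMA info existed:
	 * re-set pool->node and reapply NUMA affinity to unbound wqs */

	/* ... then create rescuers and initial workers ... */
}
```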
-
- 12 Oct 2016, 1 commit
-
-
Submitted by Petr Mladek
Patch series "kthread: Kthread worker API improvements". The intention of this patchset is to make it easier to manipulate and maintain kthreads. In particular, I want to replace all the custom main cycles with a generic one. I also want to make the kthreads sleep in a consistent state, in a common place, when there is no work. This patch (of 11): A good practice is to prefix the names of functions with the name of the subsystem. This patch fixes the name of probe_kthread_data(). The other wrong function names are part of the kthread worker API and will be fixed separately. Link: http://lkml.kernel.org/r/1470754545-17632-2-git-send-email-pmladek@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 Sep 2016, 2 commits
-
-
Submitted by Tejun Heo
keventd_up() no longer has in-kernel users. Remove it and make wq_online static. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Submitted by Tejun Heo
Workqueue is currently initialized in an early init call; however, there are cases where early boot code has to be split and reordered to come after workqueue initialization, or where the same code path is used both before and after workqueue initialization. The latter cases have to gate workqueue usage with keventd_up() tests, which is nasty and easy to get wrong. Workqueue usage has become widespread and it'd be a lot more convenient if it could be used very early in boot. This patch splits workqueue initialization into two steps: workqueue_init_early(), which sets up the basic data structures so that workqueues can be created and work items queued, and workqueue_init(), which actually brings workqueues online and starts executing queued work items. The former step can be done very early during boot once memory allocation, cpumasks and idr are initialized; the latter right after kthreads become available. This allows work item queueing and canceling from very early boot, which is what most of these use cases want.

* As system_wq being initialized no longer indicates that workqueue is fully online, update keventd_up() to test wq_online instead. The follow-up patches will get rid of all its usages and the function itself.
* Flushing doesn't make sense before workqueue is fully initialized. The flush functions trigger a WARN and return immediately before fully online.
* Work items are never in flight before fully online. Canceling can always succeed by skipping the flush step.
* Some code paths can no longer assume to be called with irqs enabled, as irqs are disabled during early boot. Use irqsave/restore operations instead.

A sketch of how the two stages slot into boot follows below. v2: Watchdog init, which requires the timer to be running, moved from workqueue_init_early() to workqueue_init(). Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/CA+55aFx0vPuMuxn00rBSM192n-Du5uxy+4AvKa0SBSOVJeuCGg@mail.gmail.com
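As referenced above, a hedged sketch of where the two calls land in init/main.c; the exact call sites are an assumption based on the commit description:

```c
/* init/main.c, roughly */
asmlinkage __visible void __init start_kernel(void)
{
	/* ... once mm, cpumasks and idr are usable ... */
	workqueue_init_early();	/* creating wqs and queueing works OK */
	/* ... rest of early boot; queued work items just accumulate ... */
}

static noinline void __init kernel_init_freeable(void)
{
	/* ... once kthreads can be spawned ... */
	workqueue_init();	/* workers start; pending items run */
}
```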
-